(57) Abstract: The present invention relates to processes for
preparing transformed plant cells or organisms by transforming a
population of plant cells which comprises at least one marker
protein having a direct or indirect toxic effect for said
population, with at least one nucleic acid sequence to be inserted
in combination with at least one compound, preferably a DNA
construct, capable of reducing the expression, amount, activity
and/or function of the marker protein, with the transformed plant
cells having a growth advantage over nontransformed cells due to
the action of said compound.

INVERSION OF THE NEGATIVE-SELECTIVE EFFECT OF NEGATIVE
MARKER PROTEINS USING SELECTION METHODS

Description 

The present invention relates to processes for preparing
transformed plant cells or organisms by transforming a population
of plant cells which comprises at least one marker protein having a
direct or indirect toxic effect for said population, with at least
one nucleic acid sequence to be inserted in combination with at
least one compound, preferably a DNA construct, capable of reducing
the expression, amount, activity and/or function of the marker
protein, with the transformed plant cells having a growth advantage
over nontransformed cells, due to the action of said compound.

Genetic material is usually introduced successfully into only a
very limited number of target cells of a population. This
necessitates distinguishing and isolating successfully transformed
cells from nontransformed cells, a process which is referred to as
selection. Traditionally, the selection is carried out by way of
a "positive" selection, wherein the transformed cell is enabled
to grow and to survive, whereas the nontransformed cell is
inhibited in its growth or destroyed (McCormick et al. (1986) Plant
Cell Reports 5:81-84). A positive selection of this kind is usually
implemented by genes which code for a resistance to a biocide
(e.g. a herbicide such as phosphinothricin, glyphosate or
bromoxynil, a metabolism inhibitor such as 2-deoxyglucose
6-phosphate (WO 98/45456) or an antibiotic such as tetracycline,
ampicillin, kanamycin, G418, neomycin, bleomycin or hygromycin).
Such genes are also referred to as positive selection markers.
The positive selection marker is coupled (physically or by means
of cotransformation) to the nucleic acid sequence to be
introduced into the cell genome and is then introduced into the
cell. Subsequently, the cells are cultured on a medium under the
appropriate selection pressure (for example in the presence of an
appropriate antibiotic or herbicide), whereby the transformed
cells, owing to the acquired resistance to said selection
pressure, have a growth/survival advantage and can thus be selected.
Positive selection markers which may be mentioned by way of
example are:

- phosphinothricin acetyltransferases (PAT) (also: Bialaphos
resistance; bar) acetylate the free amino group of the glutamine
synthase inhibitor phosphinothricin (PPT) and thus achieve a
detoxification (de Block et al. (1987) EMBO J 6:2513-2518;
Vickers JE et al. (1996) Plant Mol Biol Reporter 14:363-368;
Thompson CJ et al. (1987) EMBO J 6:2519-2523).

- 5-enolpyruvylshikimate 3-phosphate synthases (EPSPS) impart a
resistance to the nonselective herbicide glyphosate
(N-(phosphonomethyl)glycine; Steinrucken HC et al. (1980) Biochem
Biophys Res Commun 94:1207-1212; Levin JG and Sprinson DB
(1964) J Biol Chem 239:1142-1150; Cole DJ (1985) Mode of action
of glyphosate; A literature analysis, p. 48-74. In: Grossbard E
and Atkinson D (eds.) The herbicide glyphosate. Butterworths,
Boston.). Glyphosate-tolerant EPSPS variants for use as selection
markers have been described (Padgette SR et al. (1996) New weed
control opportunities: development of soybeans with a Roundup
Ready™ gene. In: Herbicide Resistant Crops (Duke SO, ed.),
pp. 53-84. CRC Press, Boca Raton, FL; Saroha MK and Malik VS
(1998) J Plant Biochemistry and Biotechnology 7:65-72; Padgette
SR et al. (1995) Crop Science 35(5):1451-1461; US 5,510,471;
US 5,776,760; US 5,864,425; US 5,633,435; US 5,627,061;
US 5,463,175; EP-A 0 218 571).

- neomycin phosphotransferases impart a resistance to
aminoglycoside antibiotics such as neomycin, G418, hygromycin,
paromomycin or kanamycin by reducing the inhibiting action
thereof by means of a phosphorylation reaction (Beck et al.
(1982) Gene 19:327-336).

- 2-deoxyglucose 6-phosphate phosphatases impart a resistance
to 2-deoxyglucose (EP-A 0 807 836; Randez-Gil et al. (1995)
Yeast 11:1233-1240; Sanz et al. (1994) Yeast 10:1195-1202).

- acetolactate synthases impart a resistance to imidazolinone/
sulfonylurea herbicides (e.g. imazamox, imazapyr, imazaquin,
imazethapyr, amidosulfuron, azimsulfuron, chlorimuron ethyl,
chlorsulfuron; Sathasivan K et al. (1990) Nucleic Acids Res
18(8):2188).



In addition, resistance genes to the antibiotics hygromycin
(hygromycin phosphotransferases), chloramphenicol (chloramphenicol
acetyltransferase), tetracycline, streptomycin, zeocin and
ampicillin (β-lactamase gene; Datta N, Richmond MH (1966) Biochem J
98(1):204-9) have been described.

Genes such as isopentenyl transferase (ipt) from Agrobacterium
tumefaciens (strain: PO22) (GenBank Acc. No.: AB025109) may
likewise be used as selection markers. The ipt gene codes for a
key enzyme of cytokinin biosynthesis. Its overexpression
facilitates the regeneration of plants (e.g. selection on
cytokinin-free medium) (Ebinuma H et al. (2000) Proc Natl Acad Sci
USA 94:2117-2121; Ebinuma H et al. (2000) Selection of Marker-free
transgenic plants using the oncogenes (ipt, rol A, B, C) of
Agrobacterium as selectable markers, In Molecular Biology of Woody
Plants. Kluwer Academic Publishers). The disadvantages here are,
firstly, the fact that the selection advantage is based on usually
subtle differences in cell proliferation and, secondly, the fact
that the plant acquires unwanted properties (gall tumor formation)
due to transformation with an oncogene.

EP-A 0 601 092 describes various other positive selection
markers. Examples which may be mentioned are: β-glucuronidase (in
connection with, for example, cytokinin glucuronide), mannose
6-phosphate isomerase (in connection with mannose), UDP-galactose
4-epimerase (in connection with galactose, for example).



Negative selection markers are used for selecting organisms in
which marker sequences have been successfully deleted (Koprek T
et al. (1999) Plant J 19(6):719-726). In the presence of a
negative selection marker, the corresponding cell is destroyed or
experiences a growth disadvantage. Negative selection involves, for
example, the negative selection marker introduced into the plant
converting a compound which otherwise has no action
disadvantageous to the plant into a compound with a disadvantageous
(i.e. toxic) action. Examples of negative selection markers include:
thymidine kinase (TK), for example of Herpes simplex virus (Wigler
et al. (1977) Cell 11:223), cellular adenine phosphoribosyl
transferase (APRT) (Wigler et al. (1979) Proc Natl Acad Sci USA
76:1373), hypoxanthine phosphoribosyl transferase (HPRT) (Jolly
et al. (1983) Proc Natl Acad Sci USA 80:477), diphtheria toxin A
fragment (DT-A), the bacterial xanthine-guanine phosphoribosyl
transferase (gpt; Besnard et al. (1987) Mol. Cell. Biol. 7:4139;
Mroz and Moolten (1993) Human Gene Therapy 4:589-595), the codA
gene coding for a cytosine deaminase (Gleave AP et al. (1999)
Plant Mol Biol 40(2):223-35; Perera RJ et al. (1993) Plant Mol
Biol 23(4):793-799; Stougaard J (1993) Plant J 3:755-761;
EP-A1 595 873), the cytochrome P450 gene (Koprek et al. (1999)
Plant J 16:719-726), genes coding for a haloalkane dehalogenase
(Naested H (1999) Plant J 18:571-576), the iaaH gene (Sundaresan V
et al. (1995) Genes & Development 9:1797-1810) or the tms2 gene
(Fedoroff NV & Smith DL (1993) Plant J 3:273-289). The negative
selection markers are usually employed in combination with
"prodrugs" or "pro-toxins", compounds which are converted into
toxins by the activity of the selection marker.

5-Methylthioribose (MTR) kinase is an enzyme whose enzymic
activity has been described in plants, bacteria and protozoa, but
not in mammals. The enzyme may convert an MTR analog
(5-(trifluoromethyl)thioribose), as a "subversive substrate" of
the methionine salvage pathway, via an unstable intermediate into
the toxic compound carbothionyl difluoride.

Said selection systems have various disadvantages. The introduced
selection marker (e.g. resistance to antibiotics) is justified
only during transformation and selection but is later a usually
unnecessary and often also undesired protein product. This may be
disadvantageous for reasons of consumer acceptance and/or approval
as a food and/or feed product. Another disadvantage in this
connection is the fact that the selection marker used for
selection is usually genetically coupled to the nucleic acid
sequence to be inserted into the genome and cannot be decoupled by
segregation during propagation or crossing. Usually, deletion of
the marker sequence is required, making additional steps
necessary. In addition, biotechnological studies require in
numerous cases multiple transformation with various gene
constructs. Here, each transformation step requires a new
selection marker unless the previously used marker is to be
laboriously deleted first. This, however, necessitates a broad
palette of well-functioning selection markers which are not
available for most plant organisms.


Consequently, it was the object of the invention to provide novel
selection processes for selecting transformed plant cells and
organisms, which, if possible, no longer have the disadvantages of
the available systems. This object is achieved by the present
invention.

The invention firstly relates to a process for preparing
transformed plant cells or organisms, which process comprises the
following steps:

a) transforming a population of plant cells, with the cells
of said population containing at least one marker protein
capable of causing directly or indirectly a toxic effect
for said population, with at least one nucleic acid
sequence to be inserted in combination with at least one
compound capable of reducing the expression, amount,
activity and/or function of at least one marker protein,
and

b) selecting transformed plant cells whose genome contains
said nucleic acid sequence and which have a growth
advantage over nontransformed cells, due to the action of said
compound, from said population of plant cells, the
selection being carried out under conditions under which the
marker protein can exert its toxic effect on the
nontransformed cells.



In a preferred embodiment, the marker protein is a protein
capable of converting directly or indirectly a substance X which is
nontoxic for said population of plant cells into a substance Y
which is toxic for said population. In this case, the process of
the invention preferably comprises the following steps:

a) transforming the population of plant cells with at least
one nucleic acid sequence to be inserted in combination
with at least one compound capable of reducing the
expression, amount, activity and/or function of at least
one marker protein, and

b) treating said population of plant cells with the
substance X at a concentration which causes a toxic effect
for nontransformed cells, due to the conversion by the
marker protein, and

c) selecting transformed plant cells whose genome contains
said inserted nucleic acid sequence and which have a
growth advantage over nontransformed cells, due to the
action of said compound, from said population of plant
cells, the selection being carried out under conditions
under which the marker protein can exert its toxic effect
on the nontransformed cells.
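
The logic of steps a) to c) can be illustrated by a minimal sketch. It is an idealized model and not part of the disclosed process: it assumes that every cell of the starting population carries an active marker protein, that uptake of the construct completely abolishes marker activity, and that conversion of substance X into toxic substance Y is fully lethal. The names Cell, transform, survives and reverse_select are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    marker_active: bool  # marker protein present and functional
    has_insert: bool     # nucleic acid sequence of interest is in the genome


def transform(cell: Cell, takes_up_construct: bool) -> Cell:
    """Step a): a cell that takes up the construct gains the insert and,
    via the coupled anti-marker compound, loses marker activity
    (assumed complete in this idealized model)."""
    if not takes_up_construct:
        return cell
    return Cell(marker_active=False, has_insert=True)


def survives(cell: Cell, substance_x_present: bool) -> bool:
    """Steps b) and c): substance X is converted into toxic substance Y
    only in cells whose marker protein is still active."""
    return not (cell.marker_active and substance_x_present)


def reverse_select(uptake_flags, substance_x_present=True):
    """Run steps a) to c) on a population in which every cell starts
    with an active marker protein and no insert."""
    population = [transform(Cell(True, False), flag) for flag in uptake_flags]
    return [cell for cell in population if survives(cell, substance_x_present)]


if __name__ == "__main__":
    selected = reverse_select([False, True, False, False, True])
    print(len(selected), all(cell.has_insert for cell in selected))
```

In this model only the cells that took up the construct remain after treatment with substance X, whereas without substance X no selection takes place and the whole population survives.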



The nontoxic substance X is preferably a substance which does not 
naturally occur in plant cells or organisms or occurs naturally 
therein only at a concentration which can essentially not cause 
any toxic effect. In the scope of the process of the invention, 
preference is given to applying the nontoxic substance X exoge- 
nously, for example via the medium or the growth substrate. 

The term "compound capable of reducing the expression, amount,
activity and/or function of at least one marker protein" is to be
understood broadly and generally means any compounds which cause,
directly or indirectly, alone or in cooperation with other
factors, a reduction in the amount of protein, amount of RNA, gene
activity, protein activity or protein function of at least one
marker protein. Said compounds are also referred to under the
generic term "anti-marker protein" compounds. The term "anti-marker
protein" compound includes in particular, but is not limited to,
the nucleic acid sequences, ribonucleic acid sequences,
double-stranded ribonucleic acid sequences, antisense ribonucleic
acid sequences, expression cassettes, peptides, proteins or other
factors used in the preferred embodiments within the scope of the
process of the invention.



In a preferred embodiment, "anti-marker protein" compound means a
DNA construct comprising

a) at least one expression cassette suitable for expressing
a ribonucleic acid sequence and/or, if appropriate, a
protein, said nucleic acid sequence and/or protein being
capable of reducing the expression, amount, activity and/
or function of the marker protein, or

b) at least one sequence which causes a partial or complete
deletion or inversion of the sequence coding for said
marker protein and thus enables the expression, amount,
activity and/or function of the marker protein to be
reduced, and also, if appropriate, further functional
elements which facilitate and/or promote said deletion or
inversion, or

c) at least one sequence which causes an insertion into the
sequence coding for said marker protein and thus enables
the expression, amount, activity and/or function of the
marker protein to be reduced, and also, if appropriate,
further functional elements which facilitate and/or
promote said insertion.



The process of the invention stops the negative-selective action
of the marker protein. To this extent, an "anti-marker protein"
compound acts directly (e.g. via inactivation by means of
insertion into the gene coding for the marker protein) or indirectly
(e.g. by means of the ribonucleic acid sequence expressed via the
expression cassette and/or, where appropriate, of the protein
translated therefrom) as a positive selection marker. Hence, the
selection system of the invention is to be referred to as a
"reverse selection system", since it "reverses" the
negative-selective action of the marker protein.

The process of the invention drastically broadens the repertoire
of positive selection processes for selecting transformed plant
cells.



PF 53790 



CA 02493364 2005-01-21 




Another advantage is the fact that, in a particular preferred
embodiment (e.g. via the action of a double-stranded or antisense
RNA), it is possible to implement the selection effect without
expressing a foreign protein (see below).

It is also advantageous that the marker protein used indirectly
for selection (e.g. the negative selection marker) is not coupled
genetically to the nucleic acid sequence to be inserted into the
genome. In contrast to the otherwise customary selection
processes, the marker protein, if it is a transgene, may be removed
by simple segregation in the course of subsequent propagation or
crossing.



"Plant cell" means within the scope of the present invention any
type of cell which has been derived from a plant organism or is
present therein. In this context, the term includes by way of
example protoplasts, callus or cell cultures, microspores, pollen,
cells in the form of tissues such as leaves, meristem, flowers,
embryos, roots, etc. Included are, in particular, all of those
cells and cell populations which are suitable as target tissues
for a transformation.



In this context, "plant organism" comprises any organism capable
of photosynthesis and also the cells, tissues, parts or
propagation material (such as seeds or fruits) derived therefrom.
Included within the scope of the invention are all genera and
species of higher and lower plants of the plant kingdom. Preference
is given to annual, perennial, monocotyledonous and
dicotyledonous plants and also gymnosperms.

"Plant" means within the scope of the invention all genera and
species of higher and lower plants of the plant kingdom. The term
includes the mature plants, seed, shoots and seedlings, and also
parts, propagation material (for example tubers, seeds or
fruits), plant organs, tissues, protoplasts, callus and other
cultures, for example cell cultures, derived therefrom, and also
any other types of groupings of plant cells to give functional or
structural units. Mature plants means plants at any developmental
stage beyond that of the seedling. Seedling means a young
immature plant at an early developmental stage. "Plant" comprises all
annual and perennial monocotyledonous and dicotyledonous plants
and includes by way of example but not by limitation those of the
genera Cucurbita, Rosa, Vitis, Juglans, Fragaria, Lotus, Medicago,
Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium,
Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa,
Capsicum, Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum,
Petunia, Digitalis, Majorana, Cichorium, Helianthus, Lactuca,
Bromus, Asparagus, Antirrhinum, Heterocallis, Nemesis,
Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio,
Salpiglossis, Cucumis, Browaalia, Glycine, Pisum, Phaseolus,
Lolium, Oryza, Zea, Avena, Hordeum, Secale, Triticum, Sorghum,
Picea and Populus.



Preference is given to plants of the following plant families:
Amaranthaceae, Asteraceae, Brassicaceae, Caryophyllaceae,
Chenopodiaceae, Compositae, Cruciferae, Cucurbitaceae, Labiatae,
Leguminosae, Papilionoideae, Liliaceae, Linaceae, Malvaceae,
Rosaceae, Rubiaceae, Saxifragaceae, Scrophulariaceae, Solanaceae,
Sterculiaceae, Tetragoniaceae, Theaceae, Umbelliferae.



Preferred monocotyledonous plants are selected in particular from
the monocotyledonous crop plants such as, for example, those in
the family of Gramineae such as alfalfa, rice, corn, wheat or
other cereal species such as barley, millet, rye, triticale or
oats and also from sugar cane and all grass species.

Preferred dicotyledonous plants are selected in particular from
the dicotyledonous crop plants such as, for example,

- Asteraceae, such as sunflower, tagetes or calendula and others,

- Compositae, in particular the genus Lactuca, very especially
the species sativa (lettuce) and others,

- Cruciferae, especially the genus Brassica, very especially the
species napus (oilseed rape), campestris (beet), oleracea cv
Tastie (cabbage), oleracea cv Snowball Y (cauliflower) and
oleracea cv Emperor (broccoli) and other cabbage species; and the
genus Arabidopsis, very especially the species thaliana, and
cress or canola and others,

- Cucurbitaceae, such as melon, pumpkin/squash or zucchini and
others,

- Leguminosae, especially the genus Glycine, very especially the
species max (soybean) and alfalfa, pea, bean plant or peanut,
and others,

- Rubiaceae, preferably the subclass Lamiidae, such as, for
example, Coffea arabica or Coffea liberica (coffee bush) and
others,

- Solanaceae, in particular the genus Lycopersicon, very
especially the species esculentum (tomato), the genus Solanum, very
especially the species tuberosum (potato) and melongena
(eggplant), and the genus Capsicum, very especially the species
annuum (pepper) and tobacco and others,

- Sterculiaceae, preferably the subclass Dilleniidae, such as,
for example, Theobroma cacao (cacao tree) and others,

- Theaceae, preferably the subclass Dilleniidae, such as, for
example, Camellia sinensis or Thea sinensis (tea shrub) and
others,

- Umbelliferae, especially the genus Daucus (very especially the
species carota (carrot)) and Apium (very especially the species
graveolens dulce (celery)) and others,



and linseed, cotton, hemp, flax, cucumber, spinach, carrot, sugar
beet and the various tree, nut and grapevine species, in
particular banana and kiwi.



Plant organisms for the purposes of the invention are furthermore
other photosynthetically active organisms such as, for
example, algae, cyanobacteria and mosses. Preferred algae are
green algae such as, for example, algae of the genus
Haematococcus, Phaeodactylum tricornutum, Volvox or Dunaliella.
Particular preference is given to Synechocystis.

Particular preference is given to the group of plants consisting
of wheat, oats, millet, barley, rye, corn, rice, buckwheat,
sorghum, triticale, spelt, linseed, sugar cane, oilseed rape, cress,
Arabidopsis, cabbage species, soybean, alfalfa, pea, bean plants,
peanut, potato, tobacco, tomato, eggplant, paprika, sunflower,
tagetes, lettuce, calendula, melon, pumpkin and zucchini.

Most preference is given to

a) plants suitable for producing oil, such as, for example,
oilseed rape, sunflower, sesame, safflower (Carthamus
tinctorius), olive tree, soybean, corn, peanut, ricinus, oil palm,
wheat, cacao tree or various nut species such as, for
example, walnut, coconut or almond. Among these, particular
preference is in turn given to dicotyledonous plants, in
particular oilseed rape, soybean and sunflower,

b) plants suitable for producing starch, such as corn, wheat or
potato, for example,

c) plants which are utilized as food and/or feedstuff and/or as
useful plants and in which a resistance to pathogens would be
advantageous, such as barley, rye, rice, potato, cotton, flax
or linseed, for example,

d) plants which may be suitable for producing fine chemicals
such as, for example, vitamins and/or carotenoids, such as
oilseed rape, for example.

"Population of plant cells" means any group of plant cells which
may be subjected within the scope of the present invention to a
transformation and from which transgenic plant cells transformed
by the process of the invention may be obtained and isolated. In
this context, said population may also be, for example, a plant
tissue, organ or a cell culture, etc. Said population may
comprise by way of example but not by limitation an isolated zygote,
an isolated immature embryo, embryogenic callus, plant or else
various flower tissues (both in vitro and in vivo).

"Genome" means the entirety of genetic information of a plant
cell and comprises both genetic information of the nucleus and
that of the plastids (e.g. chloroplasts) and mitochondria.
However, genome preferably means the genetic information of the
nucleus (for example of the nuclear chromosomes).

"Selection" means identifying and/or isolating successfully
transformed plant cells from a population of nontransformed cells
by using the process of the invention. This does not necessarily
require that the selection be carried out directly with the
transformed cells immediately after transformation. It is also
possible to carry out the selection only at a later time, even
with a later generation of the plant organisms (or cells,
tissues, organs or propagation material derived therefrom) resulting
from the transformation. Thus it is possible, for example, to
transform Arabidopsis plants directly using, for example, the
vacuum infiltration method (Clough S & Bent A (1998) Plant J
16(6):735-43; Bechtold N et al. (1993) CR Acad Sci Paris
1144(2):204-212), which subsequently produce transgenic seeds
which may then be subjected to selection.

The fact that the nucleic acid sequence to be inserted is
transformed "in combination with" the "anti-marker protein" compound
(e.g. a DNA construct) is to be understood broadly and means that
at least one nucleic acid sequence to be inserted and at least
one "anti-marker protein" compound are functionally coupled to
one another, so that the presence of the "anti-marker protein"
compound in the plant cell, and the selection advantage related
thereto, indicates that the inserted nucleic acid sequence is
likely to be present as well. The nucleic acid sequence to be
inserted and the "anti-marker protein" compound (e.g. a DNA
construct) may be, preferably but not necessarily, part of a
single nucleic acid construct (e.g. a transformation construct or
transformation vector), i.e. be physically coupled via a covalent
bond. However, they may also be introduced jointly but as separate
molecules, for example in the course of a cotransformation, and
exert their function within the scope of the process of the
invention in this way as well. In the case of the "anti-marker
protein" compound acting via expression of an RNA (e.g. an
antisense RNA or double-stranded RNA) or being such an RNA, "in
combination" may also include those embodiments in which said RNA
and the RNA expressed by the nucleic acid sequence inserted into
the genome form a single RNA strand.

"Nontoxic substance X" generally means substances which, compared
to their reaction product Y under otherwise identical
conditions, have a reduced, preferably essentially lacking,
biological activity, preferably toxicity. In this context, the toxicity
of substance Y is at least twice as high as that of substance X,
preferably at least five times as high, particularly preferably
at least ten times as high, very particularly preferably at least
twenty times as high, most preferably at least one hundred times
as high. "Identical conditions" here means that all conditions
are kept the same, apart from the different substances X and Y.
Accordingly, identical molar concentrations of X and Y are used,
with the medium, temperature, type of organism and density of
organism, etc. being the same. The substance X may be converted to
the substance Y in various ways, for example by hydrolysis,
deamination, dephosphorylation, phosphorylation, oxidation or any
other type of activation, metabolization or conversion. The
substance X may be, by way of example but not by limitation, the
inactive precursor or derivative of a plant growth regulator or
herbicide.
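
The graded thresholds for the toxicity of substance Y relative to substance X can be expressed as a simple lookup. The following is a minimal sketch only: the function name preference_tier and the tier labels are hypothetical, and it assumes that both toxicities have been measured in the same units under the identical conditions defined above (same molar concentration, medium, temperature, organism and density).

```python
def preference_tier(toxicity_x: float, toxicity_y: float) -> str:
    """Classify a substance pair by the ratio toxicity(Y) / toxicity(X),
    using the thresholds 2, 5, 10, 20 and 100 named in the text."""
    if toxicity_x <= 0:
        raise ValueError("toxicity of X must be positive to form a ratio")
    ratio = toxicity_y / toxicity_x
    tiers = [
        (100, "most preferred"),
        (20, "very particularly preferred"),
        (10, "particularly preferred"),
        (5, "preferred"),
        (2, "meets minimum requirement"),
    ]
    # Thresholds are checked from the strictest downward, so the first
    # match is the highest tier the pair qualifies for.
    for threshold, label in tiers:
        if ratio >= threshold:
            return label
    return "does not qualify"


if __name__ == "__main__":
    print(preference_tier(1.0, 50.0))
```

A pair whose ratio falls below two does not satisfy the definition of a nontoxic substance X in this sketch.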

"Toxicity" or "toxic effect" means a measurable, negative
influence on the physiology of the plant or of the plant cell and may
comprise here symptoms such as, for example, but not limited
thereto, a reduced or disrupted growth, a reduced or disrupted
rate of photosynthesis, a reduced or disrupted cell division, a
reduced or disrupted regeneration of a complete plant from cell
culture or callus, etc.

The plant cells successfully transformed by means of the process
of the invention may, to put it differently, have a growth
advantage or selection advantage over the nontransformed cells of the
same starting population under the influence of the substance
"X". Growth or selection advantage is to be understood here
broadly and means, for example, the fact that said transformed
plant cells are capable of forming shoots and/or can be
regenerated to give complete plants, whereas the nontransformed cells
can do this only with a marked delay, if at all.

The term "marker protein" is to be understood broadly and
generally means all of those proteins which are capable of

i) exerting per se a toxic effect on the plant or plant cell, or

ii) converting directly or indirectly a nontoxic substance X into
a substance Y which is toxic for the plant or plant cell.

In this context, the marker protein may be encoded by a
plant-intrinsic, endogenous gene or else by a transgene from a
different organism. Preferably, the marker protein itself has no
essential function for the organism including the marker protein.
If the marker protein per se exerts a toxic effect, then it will
preferably be expressed, for example, under an inducible promoter
rather than constitutively.

Preferably, however, the marker protein converts directly or
indirectly a nontoxic substance X into a substance Y which is toxic
for the plant or plant cell. Particularly preferred marker
proteins are the "negative selection markers" as are used, for
example, in the course of targeted deletions from the genome.

Examples of marker proteins which may be mentioned but which are
not limiting are:



(a) cytosine deaminases (CodA or CDase), with preference being
given to using as the nontoxic substance X substances such as
5-fluorocytosine (5-FC). Cytosine deaminases catalyze the
deamination of cytosine to give uracil (Kilstrup M et al.
(1989) J Bacteriol 171:2124-2127; Andersen L et al. (1989)
Arch Microbiol 152:115-118). Bacteria and fungi which have
CDase activity convert 5-FC to the toxic metabolite ("Y")
5-fluorouracil (5-FU) (Polak A & Scholer HJ (1975)
Chemotherapy (Basel) 21:113-130). 5-FC itself has low toxicity
(Bennett JE, in Goodman and Gilman: the Pharmacological Basis
of Therapeutics. 8th ed., eds. Gilman AG et al. (Pergamon
Press, New York) pp. 1165-1181). However, 5-FU has a highly
cytotoxic effect, since it is subsequently metabolized to
fluoro-UTP (FUTP) and fluoro-dUMP (FdUMP) and thus inhibits
RNA and DNA synthesis (Calabresi P & Chabner BA in Goodman
and Gilman: the Pharmacological Basis of Therapeutics. 8th
ed., eds. Gilman AG et al. (Pergamon Press, New York) pp.
1209-1263; Damon LE et al. (1989) Pharmac Ther 43:155-189).

Cells of higher plants and mammalian cells have no significant
CDase activity and cannot deaminate 5-FC (Polak A et al.
(1976) Chemotherapy 22:137-153; Koechlin BA et al. (1966)
Biochemical Pharmacology 15:434-446). In this respect, the
CDase is introduced as a transgene (e.g. in the form of a
transgenic expression cassette) into plant organisms in the
course of the process of the invention. Corresponding
transgenic plant cells or organisms are then used as master plants,
i.e. as starting material. Appropriate CDase sequences, transgenic
plant organisms and the process of carrying out negative
selection processes using, for example, 5-FC as nontoxic
substance X, are known to the skilled worker (WO 93/01281; US
5,358,866; Gleave AP et al. (1999) Plant Mol Biol
40(2):223-35; Perera RJ et al. (1993) Plant Mol Biol
23(4):793-799; Stougaard J (1993) Plant J 3:755-761; EP-A1
595 837; Mullen CA et al. (1992) Proc Natl Acad Sci USA
89(1):33-37; Kobayashi T et al. (1995) Jpn J Genet
70(3):409-422; Schlaman HRM & Hooykaas PFF (1997) Plant J
11:1377-1385; Xiaohui Wang H et al. (2001) Gene 272(1-2):
249-255; Koprek T et al. (1999) Plant J 19(6):719-726; Gallego ME
(1999) Plant Mol Biol 39(1):83-93; Salomon S & Puchta H
(1998) EMBO J 17(20):6086-6095; Thykjaer T et al. (1997)
Plant Mol Biol 35(4):523-530; Serino G (1997) Plant J
12(3):697-701; Risseeuw E (1997) Plant J 11(4):717-728; Blanc
V et al. (1996) Biochimie 78(6):511-517; Corneille S et al.
(2001) Plant J 27:171-178). Cytosine deaminases and the genes
coding therefor may be obtained from a multiplicity of
organisms, preferably microorganisms such as, for example, the
fungi Cryptococcus neoformans, Candida albicans, Torulopsis
glabrata, Sporothrix schenckii, Aspergillus, Cladosporium
and Phialophora (JE Bennett, Chapter 50: Antifungal Agents,
in Goodman and Gilman's the Pharmacological Basis of
Therapeutics 8th ed., A.G. Gilman, ed., Pergamon Press, New York,
1990) and the bacteria E. coli and Salmonella typhimurium
(Andersen L et al. (1989) Arch Microbiol 152:115-118).



The sequences, materials and processes disclosed in the
context of said publications are hereby explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: S56903, and to the modified codA sequences
described in EP-A1 595 873, which make expression in eukaryotes
possible. Preference is given here to nucleic acid sequences
coding for polypeptides according to SEQ ID NO: 2 or,
preferably, 4, in particular the sequences according to SEQ ID
NO: 1 or, preferably, 3.

(b) cytochrome P-450 enzymes, in particular the bacterial
cytochrome P-450 SU1 gene product (CYP105A1) from Streptomyces
griseolus (strain ATCC 11796), with preference being given to
using as nontoxic substance X substances such as the
pro-sulfonylurea herbicide R7402 (2-methylethyl-2,3-dihydro-
N-[(4,6-dimethoxypyrimidin-2-yl)aminocarbonyl]-1,2-
benzoisothiazole-7-sulfonamide 1,1-dioxide). Corresponding
sequences and the process of carrying out negative selection
processes using, for example, R7402 as nontoxic substance X are
known to the skilled worker (O'Keefe DP et al. (1994) Plant
Physiol 105:473-482; Tissier AF et al. (1999) Plant Cell
11:1841-1852; Koprek T et al. (1999) Plant J 19(6):719-726;
O'Keefe DP (1991) Biochemistry 30(2):447-55). The sequences,
materials and processes disclosed in the context of said
publications are hereby explicitly referred to.

Particular preference is given to sequences according to Gen- 
Bank Acc. No: M32238. Preference is further given to nucleic 
acid sequences coding for the polypeptide according to SEQ ID 
NO: 6, in particular the sequence according to SEQ ID NO: 5. 

(c) indoleacetic acid hydrolases such as, for example, the tms2
gene product of Agrobacterium tumefaciens, with preference
being given to using as nontoxic substance X substances such as
auxin amide compounds or naphthaleneacetamide (NAM) (with NAM
being converted to naphthaleneacetic acid, a phytotoxic
substance). Corresponding sequences and the process of carrying
out negative selection processes using, for example, NAM as
nontoxic substance X are known to the skilled worker
(Fedoroff NV & Smith DL (1993) Plant J 3:273-289; Upadhyaya
NM et al. (2000) Plant Mol Biol Rep 18:227-223; Depicker AG
et al. (1988) Plant Cell Rep 104:1067-1071; Karlin-Neumann
GA et al. (1991) Plant Cell 3:573-582; Sundaresan V et al.
(1995) Gene Develop 9:1797-1810; Cecchini E et al. (1998)
Mutat Res 401(1-2):199-206; Zubko E et al. (2000) Nat
Biotechnol 18:442-445). The sequences, materials and processes
disclosed in the context of said publications are hereby
explicitly referred to.






Particular preference is given to sequences according to
GenBank Acc. No: NC_003308 (Protein_id="NP_536128.1"), AE009419,
AB016260 (Protein_id="BAA87807.1") and NC002147. Preference is
further given to nucleic acid sequences coding for polypeptides
according to SEQ ID NO: 8 or 10, in particular the sequences
according to SEQ ID NO: 7 or 9.



(d) haloalkane dehalogenases (dhlA gene product), for example
from Xanthobacter autotrophicus GJ10. The dehalogenase
hydrolyzes dihaloalkanes such as 1,2-dichloroethane (DCE) to
give halogenated alcohols and inorganic halides (Naested H et
al. (1999) Plant J 18(5):571-576; Janssen DB et al. (1994) Annu
Rev Microbiol 48:163-191; Janssen DB (1989) J Bacteriol
171(12):6791-9). The sequences, materials and processes
disclosed in the context of said publications are hereby
explicitly referred to.






Particular preference is given to sequences according to
GenBank Acc. No: M26950. Preference is further given to nucleic
acid sequences coding for the polypeptide according to SEQ ID
NO: 12, in particular the sequence according to SEQ ID NO: 11.






(e) thymidine kinases (TK), in particular viral TKs from viruses
such as Herpes simplex virus, SV40, cytomegalovirus and
Varicella zoster virus, in particular the TK of Herpes simplex
virus type 1 (TK HSV-1), with preference being given to using
as nontoxic substance X substances such as Acyclovir,
Ganciclovir or 1,2-deoxy-2-fluoro-β-D-arabinofuranosil-5-
iodouracil (FIAU). Corresponding sequences and the process of
carrying out negative selection processes using, for example,
Acyclovir, Ganciclovir or FIAU as nontoxic substance X are
known to the skilled worker (Czako M & Marton L (1994) Plant
Physiol 104:1067-1071; Wigler M et al. (1977) Cell
11(1):223-232; McKnight SL et al. (1980) Nucl Acids Res
8(24):5949-5964; McKnight SL et al. (1980) Nucl Acids Res
8(24):5931-5948; Preston et al. (1981) J Virol 38(2):593-605;
Wagner et al. (1981) Proc Natl Acad Sci USA 78(3):1441-1445;
St. Clair et al. (1987) Antimicrob Agents Chemother
31(6):844-849). The sequences, materials and processes
disclosed in the context of said publications are hereby
explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: J02224, V00470 and V00467. Preference is also
given to nucleic acid sequences coding for polypeptides
according to SEQ ID NO: 14 or 16, in particular the sequences
according to SEQ ID NO: 13 or 15.

(f) guanine phosphoribosyl transferases, hypoxanthine
phosphoribosyl transferases or xanthine guanine phosphoribosyl
transferases, with preference being given to using as nontoxic
substance X substances such as 6-thioxanthine or allopurinol.
Preference is given to guanine phosphoribosyl transferases
(gpt), for example from E. coli (Besnard et al. (1987) Mol
Cell Biol 7:4139; Mroz and Moolten (1993) Human Gene Therapy
4:589-595; Ono et al. (1997) Hum Gene Ther 8(17):2043-55),
hypoxanthine phosphoribosyl transferases (HPRT; Jolly et al.
(1983) Proc Natl Acad Sci USA 80:477; Fenwick "The HGPRT
System", pp. 333-373, M. Gottesman (ed.), Molecular Cell
Genetics, John Wiley and Sons, New York, 1985), xanthine
guanine phosphoribosyl transferases, for example from
Toxoplasma gondii (Knoll LJ et al. (1998) Mol Cell Biol
18(2):807-814; Donald RG et al. (1996) J Biol Chem
271(24):14010-14019). The sequences, materials and processes
disclosed in the context of said publications are hereby
explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: U10247 (Toxoplasma gondii HXGPRT), M13422
(E. coli gpt) and X00221 (E. coli gpt). Preference is also
given to nucleic acid sequences coding for polypeptides
according to SEQ ID NO: 18, 20 or 22, in particular the
sequences according to SEQ ID NO: 17, 19 or 21.

(g) purine nucleoside phosphorylases (PNP; DeoD gene product),
for example from E. coli, with preference being given to using
as nontoxic substance X substances such as 6-methylpurine
deoxyribonucleoside. Corresponding sequences and the process
of carrying out negative selection processes using, for
example, 6-methylpurine deoxyribonucleoside as nontoxic
substance X are known to the skilled worker (Sorscher EJ et al.
(1994) Gene Therapy 1:233-238). The sequences, materials and
processes disclosed in the context of said publications are
hereby explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: M60917. Preference is also given to nucleic
acid sequences coding for the polypeptide according to SEQ ID
NO: 24, in particular the sequence according to SEQ ID NO: 23.

(h) phosphonate monoester hydrolases which convert inactive
ester derivatives of the herbicide glyphosate (e.g.
glycerylglyphosate) into the active form of the herbicide.
Corresponding sequences and the process of carrying out
negative selection processes using, for example,
glycerylglyphosate are known to the skilled worker
(US 5,254,801; Dotson SB et al. (1996) Plant J 10(2):383-392;
Dotson SB et al. (1996) J Biol Chem 271(42):25754-25761). The
sequences, materials and processes disclosed in the context of
said publications are hereby explicitly referred to.



Particular preference is given to sequences according to
GenBank Acc. No: U44852. Preference is also given to nucleic
acid sequences coding for the polypeptide according to SEQ ID
NO: 26, in particular the sequence according to SEQ ID NO: 25.

(i) aux-1 and, preferably, aux-2 gene products, for example of
the Ti plasmids of Agrobacterium strains such as A. rhizogenes
or A. tumefaciens (Beclin C et al. (1993) Transgenic Res
2:48-55; Gaudin V, Jouanin L (1995) Plant Mol Biol
28(1):123-36).






The activity of the two enzymes causes the plant cell to
produce indoleacetic acid (IAA). Aux-1 encodes an
indoleacetamide synthase (IAMS) and converts tryptophan into
indoleacetamide (VanOnckelen et al. (1986) FEBS Lett.
198:357-360). Aux-2 encodes the enzyme indoleacetamide
hydrolase (IAMH) and converts indoleacetamide, a substance
without phytohormone activity, into the active auxin
indoleacetic acid (Inze D et al. (1984) Mol Gen Genet
194:265-274; Thomashow et al. (1984) Proc Natl Acad Sci USA
81:5071-5075; Schroder et al. (1984) Eur J Biochem
138:387-391). The enzyme IAMH may also hydrolyze a number of
indoleamide substrates such as, for example,
naphthaleneacetamide, the latter being converted into the
plant growth regulator naphthaleneacetic acid (NAA). The use
of the IAMH gene as a negative selection marker is described,
for example, in US 5,180,873. Corresponding enzymes have also
been described in A. rhizogenes, A. vitis (Canaday J et al.
(1992) Mol Gen Genet 235:292-303) and Pseudomonas savastanoi
(Yamada et al. (1985) Proc Natl Acad Sci USA 82:6522-6526).
The use as a negative selection marker for destroying
particular cell tissues (e.g. pollen; US 5,426,041) or
transgenic plants (US 5,180,873) has been described.
Corresponding sequences and the process of carrying out
negative selection processes using, for example,
naphthaleneacetamide are known to the skilled worker (see
above). The sequences, materials and processes disclosed in the
context of said publications are hereby explicitly referred to.



Particular preference is given to sequences according to the 
GenBank Acc. No: M61151, AF039169 and AB025110. Preference is 
also given to nucleic acid sequences coding for polypeptides 
according to SEQ ID NO: 28, 30, 32, 34 or 36, in particular 
the sequences according to SEQ ID NO: 27, 29, 31, 33 or 35. 



(j) adenine phosphoribosyl transferases (APRT), with preference
being given to using as nontoxic substance X substances such
as 4-aminopyrazolopyrimidine. Corresponding sequences and the
process of carrying out negative selection processes using
them are known to the skilled worker (Wigler M et al. (1979)
Proc Natl Acad Sci USA 76(3):1373-6; Taylor et al. "The APRT
System", pp. 311-332, M. Gottesman (ed.), Molecular Cell
Genetics, John Wiley and Sons, New York, 1985).



(k) methoxinine dehydrogenases, with preference being given to
using as nontoxic substance X substances such as
2-amino-4-methoxybutanoic acid (methoxinine) which is converted
into the toxic methoxyvinyl glycine (Margraff R et al. (1980)
Experientia 36:846).

(l) rhizobitoxin synthases, with preference being given to using
as nontoxic substance X substances such as
2-amino-4-methoxybutanoic acid (methoxinine) which is converted
into the toxic 2-amino-4-[2-amino-3-hydroxypropyl]-trans-3-
butanoic acid (rhizobitoxin) (Owens LD et al. (1973) Weed
Science 21:63-66).

(m) 5-methylthioribose (MTR) kinases, with preference being
given to using as nontoxic substance X substances such as
5-(trifluoromethyl)thioribose (MTR analog, "subversive
substrate") which is converted, via an unstable intermediate,
into the toxic substance (Y) carbothionyl difluoride. The MTR
kinase is a key enzyme of the methionine salvage pathway.
Corresponding enzyme activities have been described in plants,
bacteria and protozoa but not in mammals. MTR kinases of
various species have been identified owing to defined sequence
motifs (Sekowska A et al. (2001) BMC Microbiol 1:15;
http://www.biomedcentral.com/1471-2180/1/15). Corresponding
sequences and the process of carrying out negative selection
processes using, for example, 5-(trifluoromethyl)thioribose
are known to the skilled worker and readily obtainable from
the appropriate sequence database (e.g. GenBank) (Sekowska A
et al. (2001) BMC Microbiol 1:15; Cornell KA et al. (1996)
317:285-290). The sequences, materials and processes disclosed
in the context of said publications are hereby explicitly
referred to.

However, a plant MTR kinase has not yet been identified
unambiguously and is provided within the scope of the process
of the invention (SEQ ID NO: 39 and, respectively, 40). In
addition, homologs from other plant species are provided,
namely from corn (SEQ ID NO: 59 and, respectively, 60), oilseed
rape (SEQ ID NO: 61, 63 and, respectively, 62, 64), rice (SEQ
ID NO: 65 and, respectively, 66) and soybean (SEQ ID NO: 67
and, respectively, 68).

Accordingly, the invention further relates to amino acid
sequences of a plant 5-methylthioribose kinase, wherein said
amino acid sequence contains at least one sequence selected
from the group consisting of SEQ ID NO: 60, 62, 64, 66 or 68.

Accordingly, the invention further relates to nucleic acid
sequences encoding a plant 5-methylthioribose kinase, wherein
said nucleic acid sequence contains at least one sequence
selected from the group consisting of SEQ ID NO: 59, 61, 63, 65
or 67. Even if said sequences are in parts only fragments of
complete cDNAs, their length is nevertheless more than
sufficient to ensure use and functionality as antisense RNA or
double-stranded RNA. Preference is given to using as marker
protein a plant endogenous MTR kinase. Further endogenous plant
MTR kinases may readily be identified by screening databases or
gene libraries using conserved motifs typical of MTR kinases.
Said motifs may be derived from Fig. 9a-b, for example. Such
motifs may comprise, by way of example but not by limitation,
the following sequences:



E(V/I)GDGN(L/I)N(L/Y/F)V(F/Y),

KQALPY(V/I)RC

SWPMT(R/K)ERAYF

PEVYHFDRT

GMRY(I/L)EPPHI

CRLTEQWFSDPY

HGDLH(S/T)GS

preferably EVGDGNIiN(Y/F)V(F/Y)

Further suitable motifs may be derived from Fig. 9a-b without
difficulty.
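
The database screening mentioned above can be illustrated with a short sketch. It is not part of the disclosed process: it merely assumes that each motif is read as a pattern in which a bracketed group such as (V/I) stands for "V or I" at that position, and it translates the motifs into regular expressions. The function names and the toy sequence in the usage note are hypothetical.

```python
import re

# Motifs copied from the list above; "(V/I)" is read as "V or I".
MOTIFS = [
    "E(V/I)GDGN(L/I)N(L/Y/F)V(F/Y)",
    "KQALPY(V/I)RC",
    "SWPMT(R/K)ERAYF",
    "PEVYHFDRT",
    "GMRY(I/L)EPPHI",
    "CRLTEQWFSDPY",
    "HGDLH(S/T)GS",
]


def motif_to_regex(motif: str) -> str:
    """Turn e.g. 'E(V/I)G' into the regular expression 'E[VI]G'."""
    return re.sub(
        r"\(([A-Z](?:/[A-Z])*)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )


def scan(protein: str) -> dict:
    """Return {motif: [0-based start positions]} for every motif found."""
    sequence = protein.upper()
    hits = {}
    for motif in MOTIFS:
        pattern = re.compile(motif_to_regex(motif))
        starts = [m.start() for m in pattern.finditer(sequence)]
        if starts:
            hits[motif] = starts
    return hits
```

For instance, scanning the invented test sequence "MAAEIGDGNLNFVYAAHGDLHTGSAA" reports the first motif at position 3 and the last motif at position 16. A real screen would run over translated database entries and would probably tolerate mismatches, which this exact-match sketch does not.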

Particular preference is given to sequences according to
GenBank Acc. No: AF212863 or AC079674 (Protein_ID=AAG51775.1).
Preference is also given to nucleic acid sequences coding for
polypeptides according to SEQ ID NO: 38 or 40, in particular
the sequences according to SEQ ID NO: 37 or 39.

(n) alcohol dehydrogenases (Adh), in particular plant Adh-1 gene
products, with preference being given to using as nontoxic
substance X substances such as allyl alcohol which is
converted in this manner into the toxic substance (Y) acrolein.
Corresponding sequences and the process of carrying out
negative selection processes using, for example, allyl alcohol
are known to the skilled worker and readily obtainable from
the appropriate sequence database (e.g. GenBank) (Wisman E et
al. (1991) Mol Gen Genet 226(1-2):120-8; Jacobs M et al.
(1988) Biochem Genet 26(1-2):105-22; Schwartz D (1981) Environ
Health Perspect 37:75-7). The sequences, materials and
processes disclosed in the context of said publications are
hereby explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: X77943, M12196, AF172282, X04049 or AF253472.
Preference is also given to nucleic acid sequences coding for
polypeptides according to SEQ ID NO: 42, 44, 46 or 48, in
particular the sequences according to SEQ ID NO: 41, 43, 45
or 47.



(o) Further suitable negative selection markers are those
sequences which exert per se a toxic action on plant cells,
such as, for example, diphtheria toxin A, ribonucleases such
as barnase and also ribosome-inhibiting proteins such as
ricin. In this context, these proteins are preferably expressed
in the plant cells inducibly rather than constitutively. The
induction is preferably carried out chemically, it being
possible, for example, to use the chemically inducible
promoters mentioned below in order to ensure said chemically
induced expression.

"Reduction" or "to reduce" is to be interpreted broadly in
connection with a marker protein or with its amount, expression,
activity and/or function and comprises the partial or essentially
complete stopping or blocking, based on different cell-biological
mechanisms, of the functionality of a marker protein in a plant
cell, plant or a part, tissue, organ, cells or seeds derived
therefrom.

A reduction for the purpose of the invention also comprises a
reduction of the amount of a marker protein down to an essentially
complete lack of said marker protein (i.e. a lack of
detectability of marker protein activity or marker protein
function or a lack of immunological detectability of said marker
protein). In this context, expression of a particular marker
protein (or of its amount, expression, activity and/or function)
in a cell or an organism is reduced preferably by more than 50%,
particularly preferably by more than 80%, very particularly
preferably by more than 90%, most preferably by more than 98%.
Reduction means in particular also the complete lack of the
marker protein (or of its amount, expression, activity and/or
function). In this context, activity and/or function mean
preferably the property of the marker protein of exerting a toxic
effect on the plant cell or the plant organism and, respectively,
the ability to convert the substance X into the substance Y. The
toxic effect caused by the marker protein is reduced preferably
by more than 50%, particularly preferably by more than 80%, very
particularly preferably by more than 90%, most preferably by more
than 98%. Within the scope of the present invention, "reduction"
of course also includes a complete, 100% reduction or removal of
the marker protein (or of its amount, expression, activity and/or
function) (for example by deleting the marker protein gene from
the genome).
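
The percentage thresholds above can be made concrete with a small calculation. The following is only an illustrative sketch: it assumes that the reduction is computed relative to a reference level measured in cells without the reducing compound, and the function names and tier labels are hypothetical shorthand for the preference levels named in the text.

```python
def percent_reduction(reference: float, observed: float) -> float:
    """Reduction of a measured level relative to a reference, in percent.

    'reference' is the marker protein level (amount, expression or
    activity) without the reducing compound, 'observed' the level with it.
    """
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return (reference - observed) / reference * 100.0


# Thresholds taken from the text, ordered from strictest to weakest.
TIERS = [
    (98.0, "most preferred"),
    (90.0, "very particularly preferred"),
    (80.0, "particularly preferred"),
    (50.0, "preferred"),
]


def preference_tier(reduction_percent: float) -> str:
    """Map a reduction in percent to the strictest tier it exceeds."""
    for threshold, label in TIERS:
        if reduction_percent > threshold:
            return label
    return "below the preferred range"
```

A residual activity of 5 units against a reference of 100 units corresponds to a reduction of about 95% and therefore falls in the "very particularly preferred" tier; a complete removal (0 units) gives 100% and the "most preferred" tier.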






The invention comprises various strategies for reducing the
expression, amount, activity and/or function of the marker
protein. The skilled worker appreciates that a number of
different methods are available for influencing the expression,
amount, activity and/or function of a marker protein in the
desired way. Examples which may be mentioned but which are not
limiting are:






a) introducing at least one marker protein double-stranded
ribonucleic acid sequence (MP-dsRNA) or an expression cassette
or expression cassettes ensuring expression thereof. Included
are those processes in which the MP-dsRNA is directed against
a marker protein gene (i.e. genomic DNA sequences such as
promoter sequences) or a marker protein gene transcript (i.e.
mRNA sequences).



b) introducing at least one marker protein antisense ribonucleic
acid sequence (MP-antisenseRNA) or an expression cassette
ensuring expression thereof. Included are those processes in
which the MP-antisenseRNA is directed against a marker protein
gene (i.e. genomic DNA sequences) or a marker protein gene
transcript (i.e. RNA sequences). α-Anomeric nucleic acid
sequences are also included.

c) introducing at least one MP-antisenseRNA combined with a
ribozyme or an expression cassette ensuring expression thereof

d) introducing at least one marker protein sense ribonucleic
acid sequence (MP-senseRNA) for inducing a cosuppression or
an expression cassette ensuring expression thereof

e) introducing at least one DNA- or protein-binding factor
against a marker protein gene, marker protein RNA or marker
protein or an expression cassette ensuring expression thereof

f) introducing at least one viral nucleic acid sequence causing
degradation of the marker protein RNA or an expression
cassette ensuring expression thereof

g) introducing at least one construct for generating a
functional loss (e.g. generation of stop codons, shifts in the
reading frame etc.) on a marker protein gene, for example by
generating an insertion, deletion, inversion or mutation in a
marker protein gene. Preferably, knockout mutants may be
generated by means of targeted insertion into said marker
protein gene via homologous recombination or by introducing
sequence-specific nucleases against marker protein gene
sequences.

It is known to the skilled worker that it is also possible to use
other processes within the scope of the present invention in
order to reduce a marker protein or its activity or function. For
example, it may also be advantageous, depending on the type of
the marker protein used, to introduce a dominant-negative variant
of a marker protein or an expression cassette ensuring expression
thereof. In this context, any single one of these processes may
cause a reduction in the expression, amount, activity and/or
function of a marker protein. A combined application is also
conceivable. Further methods are known to the skilled worker and
may comprise hindering or stopping the processing of the marker
protein, the transport of the marker protein or of its mRNA, the
inhibition of ribosome attachment, the inhibition of RNA splicing,
the induction of an enzyme degrading marker protein RNA and/or
the inhibition of translational elongation or termination.

The embodiments below will describe by way of example the
individual preferred processes:

a) Introducing a double-stranded ribonucleic acid sequence of a
marker protein (MP-dsRNA)

The process of gene regulation by means of double-stranded RNA
("double-stranded RNA interference"; dsRNAi) has been described
many times for animal and plant organisms (e.g. Matzke MA et al.
(2000) Plant Mol Biol 43:401-415; Fire A et al. (1998) Nature
391:806-811; WO 99/32619; WO 99/53050; WO 00/68374; WO 00/44914;
WO 00/44895; WO 00/49035; WO 00/63364). The processes and methods
described in the references indicated are hereby explicitly
referred to. dsRNAi processes are based on the phenomenon that
simultaneously introducing the strand and the complementary
counterstrand of a gene transcript suppresses expression of the
corresponding gene in a highly efficient manner. Preferably, the
phenotype caused is very similar to that of a corresponding
knockout mutant (Waterhouse PM et al. (1998) Proc Natl Acad Sci
USA 95:13959-64). The dsRNAi process has proved to be
particularly efficient and advantageous in reducing marker
protein expression.

Double-stranded RNA molecule means within the scope of the
invention preferably one or more ribonucleic acid sequences
which, owing to complementary sequences, are theoretically (e.g.
according to the base pair rules by Watson and Crick) and/or
actually (e.g. owing to hybridization experiments in vitro and/or
in vivo) capable of forming double-stranded RNA structures. The
skilled worker is aware of the fact that the formation of
double-stranded RNA structures represents a state of equilibrium.
Preferably, the ratio of double-stranded molecules to
corresponding dissociated forms is at least 1:10, preferably 1:1,
particularly preferably 5:1, most preferably 10:1.

The invention therefore further relates to double-stranded RNA
molecules (dsRNA molecules) which, when introduced into a plant
organism (or into a cell, tissue, organ or propagation material
derived therefrom) cause the reduction of at least one marker
protein. The double-stranded RNA molecule for reducing expression
of a marker protein (MP-dsRNA) here preferably comprises

a) a "sense" RNA strand comprising at least one ribonucleotide
sequence which is essentially identical to at least a part of
the "sense" RNA transcript of a nucleic acid sequence coding
for a marker protein, and

b) an "antisense" RNA strand which is essentially, preferably
fully, complementary to the RNA sense strand under a).

With respect to the dsRNA molecules, marker protein nucleic acid
sequence preferably means a sequence according to SEQ ID NO: 1,
3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35,
37, 39, 41, 43, 45 or 47 or a functional equivalent thereof.



"Essentially identical" means that the dsRNA sequence may also
have insertions, deletions and also individual point mutations in
comparison with the marker protein target sequence and
nevertheless causes an efficient reduction in expression. The
homology (as defined hereinbelow) between the "sense" strand of
an inhibitory dsRNA and at least one part of the "sense" RNA
transcript of a nucleic acid sequence coding for a marker protein
(or between the "antisense" strand and the complementary strand
of a nucleic acid sequence coding for a marker protein) is
preferably at least 75%, preferably at least 80%, very
particularly preferably at least 90%, most preferably 100%.

A 100% sequence identity between dsRNA and a marker protein gene 
transcript is not absolutely necessary in order to cause an effi- 
cient reduction in marker protein expression. Consequently, the 
process is advantageously tolerant toward sequence deviations as 
may be present due to genetic mutations, polymorphisms or evolu- 
tionary divergences. Thus it is possible, for example, using the 
dsRNA which has been generated starting from the marker protein 
sequence of the first organism, to suppress marker protein ex- 
pression in a second organism. This is particularly advantageous 
when the marker protein used is a plant-intrinsic, endogenous 
marker protein (for example a 5-methylthioribose kinase or alco- 
hol dehydrogenase). For this purpose, the dsRNA preferably in- 
cludes sequence regions of marker protein gene transcripts which 
correspond to conserved regions. Said conserved regions may be
readily derived from sequence comparisons.
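
Deriving conserved regions from a sequence comparison can be sketched as follows. The example is a simplification under stated assumptions: the transcripts are taken to be already aligned to equal length (gaps written as "-"), a column counts as conserved only if all sequences carry the same base there, and the function name and the minimum window length are hypothetical.

```python
def conserved_windows(aligned: list, min_len: int) -> list:
    """Return (start, end) column ranges, end exclusive, in which all
    aligned sequences are identical and gap-free for at least min_len
    consecutive columns."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    # One flag per alignment column: True if fully conserved.
    conserved = [
        len(set(column)) == 1 and column[0] != "-"
        for column in zip(*(seq.upper() for seq in aligned))
    ]

    windows = []
    start = None
    # A trailing False closes a window that runs to the last column.
    for i, ok in enumerate(conserved + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_len:
                windows.append((start, i))
            start = None
    return windows
```

With the three invented aligned fragments "AUGGCUAAGC", "AUGGCUCAGC" and "AUGGCU-AGC", a minimum length of 4 yields the single window (0, 6). In practice the alignment itself would come from a dedicated alignment program, and a less strict conservation criterion might be chosen.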

The length of the subsection is at least 10 bases, preferably at
least 25 bases, particularly preferably at least 50 bases, very
particularly preferably at least 100 bases, most preferably at
least 200 bases or at least 300 bases.

Alternatively, an "essentially identical" dsRNA may also be
defined as a nucleic acid sequence capable of hybridizing with
part of a marker protein gene transcript (e.g. in 400 mM NaCl,
40 mM PIPES pH 6.4, 1 mM EDTA at 50°C or 70°C for 12 to 16 h).



"Essentially complementary" means that the "antisense" RNA strand
may also have insertions, deletions and also individual point
mutations in comparison with the complement of this "sense" RNA
strand. The homology between the "antisense" RNA strand and the
complement of the "sense" RNA strand is preferably at least 80%,
preferably at least 90%, very particularly preferably at least
95%, most preferably 100%.
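
The two definitions above ("essentially identical" and "essentially complementary") can be checked numerically, as in the following sketch. It is deliberately simplified: it compares equal-length strands position by position and therefore covers point mutations only, not insertions or deletions, which would require a proper alignment. All names are hypothetical.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Ungapped identity of two equal-length sequences, in percent."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def check_dsrna(sense: str, antisense: str, target_segment: str) -> dict:
    """Compare the sense strand with the target transcript segment and
    the antisense strand with the complement of the sense strand."""
    return {
        "sense_vs_target": percent_identity(sense, target_segment),
        "antisense_vs_complement": percent_identity(
            antisense, reverse_complement(sense)
        ),
    }
```

A sense strand differing from a 20-base target segment at two positions scores 90% identity, which meets the "very particularly preferably at least 90%" level named above; an antisense strand that is the exact reverse complement of the sense strand scores 100%.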



"Part of the "sense" RNA transcript" of a nucleic acid sequence 
coding for a marker protein means fragments of an RNA or mRNA
transcribed or transcribable from a nucleic acid sequence coding
for a marker protein, preferably from a marker protein gene. In
this context, the fragments have a sequence length of preferably
at least 20 bases, preferably at least 50 bases, particularly
preferably at least 100 bases, very particularly preferably at
least 200 bases, most preferably at least 500 bases. The complete
transcribable RNA or mRNA is also included. Included are also
sequences such as those which may be transcribed under artificial
conditions from regions of a marker protein gene which are
otherwise, under natural conditions, not transcribed, such as
promoter regions, for example.


The dsRNA may consist of one or more strands of
polyribonucleotides. Naturally, in order to achieve the same
purpose, it is also possible to introduce a plurality of
individual dsRNA molecules which comprise in each case one of the
above-defined ribonucleotide sequence sections into the cell or
the organism. The double-stranded dsRNA structure may be formed
starting from two complementary, separate RNA strands or,
preferably, starting from a single, self-complementary RNA
strand. In this case, the "sense" RNA strand and the "antisense"
RNA strand are preferably connected covalently to one another in
the form of an inverted repeat.






As described in WO 99/53050, for example, the dsRNA may also
comprise a hairpin structure by connecting the "sense" and the
"antisense" strands by a connecting sequence ("linker"; for
example an intron). Preference is given to the self-complementary
dsRNA structures, since they require only the expression of an
RNA sequence and always comprise the complementary RNA strands in
an equimolar ratio. The connecting sequence is preferably an
intron (e.g. an intron of the potato ST-LS1 gene; Vancanneyt GF
et al. (1990) Mol Gen Genet 220(2):245-250).

The nucleic acid sequence coding for a dsRNA may include further
elements such as, for example, transcription termination signals
or polyadenylation signals.
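
The layout of a self-complementary (hairpin) construct described above, a "sense" arm, a connecting sequence and an "antisense" arm, can be sketched at the DNA level as follows. This is an illustration only; the function names are hypothetical, the sequences in the usage note are invented placeholders, and a real construct would use a marker protein fragment of the lengths given above and, for example, an intron as linker.

```python
# Watson-Crick complement table for DNA bases.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def hairpin_insert(sense_fragment: str, linker: str) -> str:
    """Assemble sense arm + linker + antisense arm (inverted repeat)."""
    sense = sense_fragment.upper()
    return sense + linker.upper() + reverse_complement(sense)


def arms_are_complementary(insert: str, arm_length: int) -> bool:
    """Check that the last arm_length bases are the reverse complement
    of the first arm_length bases, i.e. that the transcript can fold
    back on itself to form the double-stranded stem."""
    return insert[-arm_length:] == reverse_complement(insert[:arm_length])
```

For the placeholder fragment "ATGGC" and linker "TTTT" the assembled insert is "ATGGCTTTTGCCAT". Because both arms are part of one transcript, they are necessarily present in an equimolar ratio, which is the advantage the text attributes to self-complementary structures.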

Bringing together, if intended, the two strands of the dsRNA in a
cell or plant may be achieved by way of example in the following
way:

a) transformation of the cell or plant with a vector comprising 
both expression cassettes, 






b) cotransformation of the cell or plant with two vectors, one 
of which comprises the expression cassettes containing the 
"sense" strand and the other one of which comprises the ex- 
pression cassettes containing the "antisense" strand. 

The formation of the RNA duplex may be initiated either outside 
or inside the cell. 




The dsRNA may be synthesized either in vivo or in vitro. For this
purpose, a DNA sequence coding for a dsRNA may be inserted into
an expression cassette under the control of at least one genetic
control element (such as a promoter, for example). Polyadenylation
is not necessary, and neither need any elements for initiating
translation be present. Preference is given to the expression
cassette for the MP-dsRNA being present on the transformation
construct or the transformation vector. For this purpose, the
expression cassettes coding for the "antisense" strand and/or the
"sense" strand of an MP-dsRNA or for the self-complementary strand
of the dsRNA are preferably inserted into a transformation vector
and introduced into the plant cell by using the processes described
below. A stable insertion into the genome may be advantageous for
the process of the invention but is not absolutely necessary. Since
a dsRNA causes a long-term effect, transient expression is also
sufficient in many cases. The dsRNA may also be part of the RNA to
be expressed by the nucleic acid sequence to be inserted by fusing
it, for example, to the 3'-untranslated part of said RNA.




The dsRNA may be introduced in an amount which makes possible at
least one copy per cell. Higher amounts (e.g. at least 5, 10,
100, 500 or 1000 copies per cell) may, if appropriate, cause a
more efficient reduction.

b) Introducing an antisense ribonucleic acid sequence of a marker
protein (MP-antisenseRNA)

Processes for reducing a particular protein by means of the
"antisense" technique have been described multiple times, also in
plants (Sheehy et al. (1988) Proc Natl Acad Sci USA 85:8805-8809;
US 4,801,340; Mol JN et al. (1990) FEBS Lett 268(2):427-430). The
antisense nucleic acid molecule hybridizes or binds to the cellular
mRNA and/or genomic DNA coding for the marker protein to be reduced,
thereby suppressing transcription and/or translation of said marker
protein. The hybridization may be produced in a conventional manner
via the formation of a stable duplex or, in the case of genomic DNA,
by binding of the antisense nucleic acid molecule to the duplex of
the genomic DNA via specific interaction in the major groove of the
DNA helix.



An MP-antisenseRNA may be derived using the nucleic acid sequence
coding for this marker protein, for example the nucleic acid
sequence according to SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19,
21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45 or 47, according
to the base pair rules by Watson and Crick. The MP-antisenseRNA may
be complementary to the entire transcribed mRNA of the marker
protein, may be limited to the coding region or may consist only of
an oligonucleotide which is complementary to a part of the coding or
noncoding sequence of the mRNA. Thus, for example, the
oligonucleotide may be complementary to the region comprising the
translation start site for the marker protein. The MP-antisenseRNA
may be, for example, 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50
nucleotides in length, but may also be longer and comprise at least
100, 200, 500, 1000, 2000 or 5000 nucleotides. MP-antisenseRNAs are
preferably expressed recombinantly in the target cell in the course
of the process of the invention.
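
Deriving an antisense sequence by the Watson-Crick rules, as
described above, can be sketched in a few lines of Python. The sketch
is illustrative only: the function names, the default flank length
and the example sequence are hypothetical and do not correspond to
any SEQ ID NO of the specification.

```python
# Illustrative sketch only; names and sequences are hypothetical.
RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def antisense_rna(coding_dna: str) -> str:
    """Antisense RNA (5'->3') for a coding-strand DNA sequence (5'->3')."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(coding_dna.upper()))


def antisense_oligo_at_start(coding_dna: str, flank: int = 6) -> str:
    """Antisense oligonucleotide covering the first ATG plus `flank`
    bases on either side (fewer upstream if the sequence is shorter)."""
    start = coding_dna.upper().find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    window = coding_dna[max(0, start - flank): start + 3 + flank]
    return antisense_rna(window)


# Hypothetical 5' end of a marker protein transcript (DNA notation).
example = "GGAACAATGGCTTCAGGT"
oligo = antisense_oligo_at_start(example)
```

For the toy sequence above the resulting 15-nucleotide oligonucleotide
contains the triplet CAU, the complement of the AUG start codon.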

The MP-antisenseRNA may also be part of an RNA to be expressed by
the nucleic acid sequence to be inserted by being fused, for
example, to the 3'-untranslated part of said RNA.




The invention further relates to transgenic expression cassettes
containing a nucleic acid sequence coding for at least part of a
marker protein, with said nucleic acid sequence being functionally
linked in antisense orientation to a promoter functional in
plant organisms. Said expression cassettes may be part of a
transformation construct or transformation vector or else may be
introduced in the course of a cotransformation.

In a further preferred embodiment, expression of a marker protein
may be inhibited by nucleotide sequences which are complementary
to the regulatory region of a marker protein gene (e.g. a marker
protein promoter and/or enhancer) and which form triple-helical
structures with the DNA double helix there, thereby reducing
transcription of the marker protein gene. Corresponding processes
have been described (Helene C (1991) Anticancer Drug Des
6(6):569-84; Helene C et al. (1992) Ann NY Acad Sci 660:27-36;
Maher LJ (1992) Bioessays 14(12):807-815).

In a further embodiment, the MP-antisenseRNA may be an α-anomeric
nucleic acid. Such α-anomeric nucleic acid molecules form with
complementary RNA specific double-stranded hybrids in which, in
contrast to the conventional β-nucleic acids, the two strands are
oriented parallel to one another (Gautier C et al. (1987) Nucleic
Acids Res 15:6625-6641).

c) Introducing an MP-antisenseRNA combined with a ribozyme 

Advantageously, the above-described antisense strategy may be
coupled to a ribozyme process. Catalytic RNA molecules or ribozymes
may be adapted to any target RNA and cleave the phosphodiester
backbone at specific positions, thereby functionally deactivating
said target RNA (Tanner NK (1999) FEMS Microbiol Rev
23(3):257-275). In the process, the ribozyme is not modified itself
but is capable of cleaving further target RNA molecules in an
analogous manner, thereby acquiring the properties of an enzyme. The
incorporation of ribozyme sequences into "antisense" RNAs imparts
specifically to these "antisense" RNAs this enzyme-like,
RNA-cleaving property and thus increases their efficiency in
inactivating the target RNA. The preparation and use of appropriate
ribozyme "antisense" RNA molecules have been described (inter alia
in Haseloff and Gerlach (1988) Nature 334:585-591; Steinecke P et
al. (1992) EMBO J 11(4):1525-1530; de Feyter R et al. (1996) Mol Gen
Genet 250(3):329-338).

In this way, it is possible to use ribozymes (e.g. hammerhead
ribozymes; Haseloff and Gerlach (1988) Nature 334:585-591) in order
to catalytically cleave the mRNA of a marker protein to be reduced
and thus prevent translation. The ribozyme technique may increase
the efficiency of an antisense strategy. Processes for expressing
ribozymes in order to reduce particular proteins have been described
(EP 0 291 533, EP 0 321 201, EP 0 360 257). Ribozyme expression has
likewise been described in plant cells (Steinecke P et al. (1992)
EMBO J 11(4):1525-1530; de Feyter R et al. (1996) Mol Gen Genet
250(3):329-338). Suitable target sequences and ribozymes may be
determined, for example, as described in "Steinecke P, Ribozymes,
Methods in Cell Biology 50, Galbraith et al. eds. Academic Press,
Inc. (1995), pp. 449-460", by calculating the secondary structures
of ribozyme RNA and target RNA and by the interaction thereof
(Bayley CC et al. (1992) Plant Mol Biol 18(2):353-361; Lloyd AM and
Davis RW (1994) Mol Gen Genet 242(6):653-657). It is possible, for
example, to construct derivatives of the Tetrahymena L-19 IVS RNA
which have regions complementary to the mRNA of the marker protein
to be suppressed (see also US 4,987,071 and US 5,116,742).
Alternatively, such ribozymes may also be identified via a selection
process from a library of various ribozymes (Bartel D and
Szostak JW (1993) Science 261:1411-1418).




d) Introducing a sense ribonucleic acid sequence of a marker
protein (MP-senseRNA) for inducing a cosuppression

Expression of a marker protein ribonucleic acid sequence (or a
part thereof) in sense orientation may result in a cosuppression
of the corresponding marker protein gene. Expression of sense RNA
with homology to an endogenous marker protein gene may reduce or
switch off expression of the latter, as has been described
similarly for antisense approaches (Jorgensen et al. (1996) Plant
Mol Biol 31(5):957-973; Goring et al. (1991) Proc Natl Acad Sci USA
88:1770-1774; Smith et al. (1990) Mol Gen Genet 224:447-481;
Napoli et al. (1990) Plant Cell 2:279-289; Van der Krol et al.
(1990) Plant Cell 2:291-99). In this context, the introduced
construct may represent completely or only partially the homologous
gene to be reduced. The possibility of translation is not
required. The application of this technique to plants has been
described (e.g. Napoli et al. (1990) Plant Cell 2:279-289;
US 5,034,323).



The cosuppression is preferably carried out using a sequence
which is essentially identical to at least part of the nucleic
acid sequence coding for a marker protein, for example the nucleic
acid sequence according to SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15,
17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45 or 47.

The MP-senseRNA is preferably chosen in such a way that a
translation of the marker protein or a part thereof cannot occur.
For this purpose, for example, the 5'-untranslated or
3'-untranslated region may be chosen or else the ATG start codon may
be deleted or mutated.

e) Introducing DNA- or protein-binding factors against marker
protein genes, marker protein RNAs or proteins

Marker protein expression may also be reduced using specific
DNA-binding factors, for example factors of the zinc finger
transcription factor type. These factors attach to the genomic
sequence of the endogenous target gene, preferably in the



PF 53790 



CA 02493364 2005-01-21 



regulatory regions, and cause a reduction in expression.
Appropriate processes for preparing corresponding factors have been
described (Dreier B et al. (2001) J Biol Chem 276(31):29466-78;
Dreier B et al. (2000) J Mol Biol 303(4):489-502; Beerli RR et
al. (2000) Proc Natl Acad Sci USA 97(4):1495-1500; Beerli RR et
al. (2000) J Biol Chem 275(42):32617-32627; Segal DJ and Barbas
CF 3rd. (2000) Curr Opin Chem Biol 4(1):34-39; Kang JS and Kim JS
(2000) J Biol Chem 275(12):8742-8748; Beerli RR et al. (1998)
Proc Natl Acad Sci USA 95(25):14628-14633; Kim JS et al. (1997)
Proc Natl Acad Sci USA 94(8):3616-3620; Klug A (1999) J Mol Biol
293(2):215-218; Tsai SY et al. (1998) Adv Drug Deliv Rev
30(1-3):23-31; Mapp AK et al. (2000) Proc Natl Acad Sci USA
97(8):3930-3935; Sharrocks AD et al. (1997) Int J Biochem Cell
Biol 29(12):1371-1387; Zhang L et al. (2000) J Biol Chem
275(43):33850-33860).

These factors may be selected using any segment of a marker protein
gene. This segment is preferably located in the promoter region.
However, for gene suppression, it may also be in the region of the
coding exons or introns.

It is also possible to introduce factors which inhibit the marker
protein itself into a cell. These protein-binding factors may be,
for example, aptamers (Famulok M and Mayer G (1999) Curr Top
Microbiol Immunol 243:123-36) or antibodies or antibody fragments
or single-chain antibodies. Obtaining these factors has been
described (Owen M et al. (1992) Biotechnology (N Y) 10(7):790-794;
Franken E et al. (1997) Curr Opin Biotechnol 8(4):411-416;
Whitelam (1996) Trends Plant Sci 1:268-272).

f) Introducing viral nucleic acid sequences and expression
constructs causing the degradation of marker protein RNA

A reduction of marker protein expression may also be effectively
implemented by inducing the specific degradation of marker protein
RNA by the plant with the aid of a viral expression system
(amplicon; Angell SM et al. (1999) Plant J 20(3):357-362). These
systems, also referred to as "VIGS" (virus-induced gene silencing),
introduce nucleic acid sequences with homology to the transcript of
a marker protein to be reduced into the plant by means of viral
vectors. Transcription is then switched off, presumably mediated by
plant defence mechanisms against viruses. Appropriate techniques and
processes have been described (Ratcliff F et al. (2001) Plant J
25(2):237-45; Fagard M and Vaucheret H (2000) Plant Mol Biol
43(2-3):285-93; Anandalakshmi R et al. (1998) Proc Natl Acad Sci
USA 95(22):13079-84; Ruiz MT (1998) Plant Cell 10(6):937-46).

VIGS-mediated reduction is preferably implemented using a sequence
which is essentially identical to at least part of the nucleic acid
sequence coding for a marker protein, for example the nucleic acid
sequence according to SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19,
21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45 or 47.




g) Introducing constructs for generating a functional loss or a 
functional reduction of marker protein genes 



The skilled worker knows numerous possible processes for modifying
genomic sequences in a targeted manner. These include, in
particular, processes such as the generation of knockout mutants
by means of targeted homologous recombination, for example by
generating stop codons, shifts in the reading frame etc. (Hohn B
and Puchta H (1999) Proc Natl Acad Sci USA 96:8321-8323) or the
targeted deletion or inversion of sequences by means of, for
example, sequence-specific recombinases or nucleases (see below).

In a preferred embodiment, the marker protein gene is inactivated
by introducing a sequence-specific recombinase. Thus it is
possible, for example, for the marker protein gene to include
recognition sequences for sequence-specific recombinases or to be
flanked by such sequences, and introducing the recombinase then
deletes or inverts particular sequences of the marker protein
gene, thus leading to inactivation of the marker protein gene. A
corresponding procedure is depicted diagrammatically in Fig. 1.

Appropriate processes for deletion/inversion of sequences by
means of sequence-specific recombinase systems are known to the
skilled worker. Examples which may be mentioned are the Cre/lox
system of bacteriophage P1 (Dale EC and Ow DW (1991) Proc Natl
Acad Sci USA 88:10558-10562; Russell SH et al. (1992) Mol Gen
Genet 234:49-59; Osborne BI et al. (1995) Plant J 7:687-701), the
yeast FLP/FRT system (Kilby NJ et al. (1995) Plant J 8:637-652;
Lyznik LA et al. (1996) Nucl Acids Res 24:3784-3789), the Gin
recombinase of the Mu phage, the E. coli Pin recombinase and the
R/RS system of the pSR1 plasmids (Onouchi H et al. (1995) Mol Gen
Genet 247:653-660; Sugita K et al. (2000) Plant J 22:461-469). In
these systems, the recombinase (for example Cre or FLP) interacts
specifically with its particular recombination sequences (34 bp
lox sequence and, respectively, 47 bp FRT sequence). Preference is
given to the bacteriophage P1 Cre/lox and the yeast FLP/FRT
systems. The FLP/FRT and Cre/lox recombinase systems have already
been applied in plant systems (Odell et al. (1990) Mol Gen Genet
223:369-378). Preference is given to introducing the recombinase
by means of recombinant expression starting from an expression
cassette included on a DNA construct.
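
The two outcomes mentioned above (deletion or inversion of the
sequence lying between two recognition sequences) can be illustrated
by a minimal string-level Python sketch. It is an assumption-laden
model, not a description of the enzymology: it treats recombination
as a single event between the first two sites found, it uses the
commonly quoted 34 bp loxP sequence (not given in this specification)
as the recognition sequence, and the flanking and marker sequences
are hypothetical.

```python
# Illustrative sketch only; a string model of one recombination event.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Commonly quoted loxP site: 13 bp repeat + 8 bp spacer + 13 bp repeat.
# The asymmetric spacer gives the site its orientation.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def revcomp(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def recombine(seq: str, site: str = LOXP) -> str:
    """Apply one recombination event between the first two sites.

    Sites in the same orientation: the intervening segment is excised
    and a single site remains. Sites in opposite orientation: the
    intervening segment is inverted. Otherwise `seq` is returned as is.
    """
    first = seq.find(site)
    if first < 0:
        return seq
    rest = first + len(site)
    direct = seq.find(site, rest)
    inverted = seq.find(revcomp(site), rest)
    if direct >= 0 and (inverted < 0 or direct < inverted):
        return seq[:rest] + seq[direct + len(site):]
    if inverted >= 0:
        return seq[:rest] + revcomp(seq[rest:inverted]) + seq[inverted:]
    return seq


marker = "ATGGCCAAGTGA"  # hypothetical stand-in for a marker gene
flanked_direct = "CCCC" + LOXP + marker + LOXP + "GGGG"
flanked_inverted = "CCCC" + LOXP + marker + revcomp(LOXP) + "GGGG"
```

With directly repeated sites the marker stand-in disappears from the
result; with inverted sites it remains present but in reverse
complement orientation.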



The activity or amount of the marker protein may also be reduced
by a targeted deletion in the marker protein gene, for example by
sequence-specific induction of DNA double-strand breaks at a
recognition sequence for specific induction of DNA double-strand
breaks in or close to the nucleic acid sequence coding for a
marker protein. In its simplest embodiment (cf. Fig. 2, A and B),
an enzyme which generates at least one double-strand break is to
this end introduced together with the transformation construct, in
such a way that the resulting illegitimate recombination or
deletion causes a reduction in the activity or amount of marker
protein, for example by inducing a shift in the reading frame or
deletion of essential sequences.



The efficiency of this approach may be increased by the sequence
coding for the marker protein being flanked by sequences (A and,
respectively, A') which have a sufficient length and homology to
one another in order to recombine with one another as a
consequence of the induced double-strand break and thus to cause,
due to an intramolecular homologous recombination, a deletion of the
sequence coding for the marker protein. Fig. 3 depicts
diagrammatically a corresponding procedure in an exemplary
embodiment of this variant.

The amount, function and/or activity of the marker protein may
also be reduced by a targeted insertion of nucleic acid sequences
(for example of the nucleic acid sequence to be inserted within
the scope of the process of the invention) into the sequence coding
for a marker protein (e.g. by means of intermolecular homologous
recombination). This embodiment of the process of the invention is
particularly advantageous and preferred, since, in addition to the
general advantages of the process of the invention, it also makes
it possible to insert the nucleic acid sequence to be inserted into
the plant genome in a reproducible, predictable, location-specific
manner. This avoids the positional effects which otherwise occur in
the course of a random, location-unspecific insertion (and which
may manifest themselves, for example, in the form of different
levels of expression of the transgene or in unintended inactivation
of endogenous genes). Preference is given to using as an
"anti-marker protein" compound in the course of this embodiment a
DNA construct which comprises at least part of the sequence of a
marker protein gene or neighbouring sequences and which can thus
specifically recombine with said sequences in the target cell so
that a deletion, addition or substitution of at least one
nucleotide alters the marker protein gene in such a way that the
functionality of said marker protein gene is reduced or completely
removed. The alteration may also affect the regulatory elements
(e.g. the promoter) of the marker protein gene so that the coding
sequence remains unaltered, but expression (transcription and/or
translation) does not occur or is reduced. In conventional
homologous recombination, the sequence to be inserted is flanked at
its 5' and/or 3' end by further nucleic acid sequences (A' and,
respectively, B') which have a sufficient length and homology to
corresponding sequences of the marker protein gene (A and,
respectively, B) for making homologous recombination possible. The
length is usually in a range from several hundred bases to several
kilobases (Thomas KR and Capecchi MR (1987) Cell 51:503; Strepp et
al. (1998) Proc Natl Acad Sci USA 95(8):4368-4373). The homologous
recombination is carried out by transforming the plant cell
containing the recombination construct by using the process
described below and selecting successfully recombined clones based
on the subsequently inactivated marker protein. Although homologous
recombination is a relatively rare event in plant organisms,
recombination into the marker protein gene allows the cell to
escape the selection pressure, so that the recombined cells can be
selected and the process is sufficiently efficient. Fig. 4
diagrammatically depicts a corresponding procedure in an exemplary
embodiment of this variant.
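
The outcome of the insertion by homologous recombination described
above (construct A'-insert-B' recombining with genomic A ... B) can
be sketched as a simple string operation in Python. This is a toy
model under stated assumptions: A/A' and B/B' are taken to be
perfectly identical, a clean double crossover is assumed, and the arm
sequences are only a few bases long, whereas the text above speaks of
several hundred bases to several kilobases. All names and sequences
are hypothetical.

```python
# Illustrative sketch only; a string model of a double crossover.
def homologous_insertion(target: str, arm_a: str, cargo: str, arm_b: str) -> str:
    """Replace the target region between homology arms A and B by `cargo`.

    `arm_a` and `arm_b` stand for the flanking sequences A'/B' of the
    construct and must occur (in this order) in `target`.
    """
    a_pos = target.find(arm_a)
    if a_pos < 0:
        raise ValueError("homology arm A not found in target")
    a_end = a_pos + len(arm_a)
    b_pos = target.find(arm_b, a_end)
    if b_pos < 0:
        raise ValueError("homology arm B not found downstream of A")
    return target[:a_end] + cargo + target[b_pos:]


ARM_A = "ACGTAC"          # hypothetical, far shorter than real arms
ARM_B = "TGCATG"          # hypothetical, far shorter than real arms
MARKER_PART = "ATGAAATAA" # hypothetical part of the marker protein gene
genome = "TT" + ARM_A + MARKER_PART + ARM_B + "TT"
recombined = homologous_insertion(genome, ARM_A, "GGCC", ARM_B)
```

In the recombined sequence the segment between the arms has been
replaced by the inserted sequence, which is the situation in which
the marker protein gene is inactivated and the insert sits at a
predictable location.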



In an advantageous embodiment of the invention, however, insertion
into the marker protein gene is facilitated by means of further
functional elements. The term is to be understood as being
comprehensive and means the use of sequences or of transcripts or
polypeptides derived therefrom which are capable of increasing
the efficiency of the specific integration into a marker protein
gene. Various processes are available to the skilled worker for
this purpose. However, preference is given to implementing the
insertion by inducing a sequence-specific double-strand break in
or close to the marker protein gene.




In a preferred embodiment of the invention, the marker protein is 
inactivated (i.e. the amount, expression, activity or function is 
reduced) by integrating a DNA sequence into a marker protein 
gene, with the process preferably comprising the following steps: 



i) introducing an insertion construct and at least one enzyme
suitable for inducing DNA double-strand breaks at a recognition
sequence for targeted induction of DNA double-strand breaks in or
close to the marker protein gene, and

ii) inducing DNA double-strand breaks at the recognition sequences
for targeted induction of DNA double-strand breaks in or close to
the marker protein gene, and

iii) inserting the insertion construct into the marker protein
gene, with the functionality of the marker protein gene and,
preferably, the functionality of the recognition sequence for
targeted induction of DNA double-strand breaks being inactivated
so that the enzyme suitable for induction of DNA double-strand
breaks can no longer cut said recognition sequence, and

iv) selecting plants or plant cells in which the insertion
construct has been inserted into the marker protein gene.

The insertion construct preferably comprises the nucleic acid
sequence to be inserted into the genome but may also be used
separately therefrom.

"Enzyme suitable for inducing DNA double-strand breaks at the
recognition sequence for targeted induction of DNA double-strand
breaks" ("DSBI enzyme" for "double-strand-break inducing enzyme"
hereinbelow) means generally all those enzymes which are capable
of generating sequence-specifically double-strand breaks in
double-stranded DNA. Examples which may be mentioned but which
are not limiting are:

1. Restriction endonucleases, preferably type II restriction
endonucleases, particularly preferably Homing endonucleases as
described in detail hereinbelow.

2. Artificial nucleases as described in detail hereinbelow, such
as, for example, chimeric nucleases, mutated restriction or
Homing endonucleases or RNA protein particles derived from
group II mobile introns.

Both natural and artificially prepared DSBI enzymes are suitable.
Preference is given to all of those DSBI enzymes whose recognition
sequence is known and which can either be obtained in the form of
their proteins (for example by purification) or be expressed using
their nucleic acid sequence.




Preference is given to selecting the DSBI enzyme, with the
knowledge of its specific recognition sequence, in such a way that
it possesses, apart from the target recognition sequence, no further
functional recognition regions in the genome of the target plant.
Very particular preference is therefore given to Homing
endonucleases (overview: Belfort M and Roberts RJ (1997) Nucleic
Acids Res 25:3379-3388; Jasin M (1996) Trends Genet 12:224-228;
Internet: http://rebase.neb.com/rebase/rebase.homing.html; Roberts
RJ and Macelis D (2001) Nucl Acids Res 29:268-269). The latter
fulfill said requirement, owing to their long recognition sequences.
The sequences coding for Homing endonucleases of this kind may be
isolated, for example, from the Chlamydomonas chloroplast genome
(Turmel M et al. (1993) J Mol Biol 232:446-467). Suitable Homing
endonucleases are listed under the abovementioned internet address.
Examples of Homing endonucleases which may be mentioned are those
like F-SceI, F-SceII, F-SuvI, F-TevI, F-TevII, I-AmaI, I-AniI,
I-CeuI, I-CeuAIIP, I-ChuI, I-CmoeI, I-CpaI, I-CpaII, I-CreI,
I-CrepsbIP, I-CrepsbIIP, I-CrepsbIIIP, I-CrepsbIVP, I-CsmI, I-CvuI,
I-CvuAIP, I-DdiII, I-DirI, I-DmoI, I-HspNIP, I-LlaI, I-MsoI, I-NaaI,
I-NanI, I-NclIP, I-NgrIP, I-NitI, I-NjaI, I-Nsp236IP, I-PakI,
I-PboIP, I-PcuIP, I-PcuAI, I-PcuVI, I-PgrIP, I-PobIP, I-PorI,
I-PorIIP, I-PpbIP, I-PpoI, I-SPBetaIP, I-ScaI, I-SceI, I-SceII,
I-SceIII, I-SceIV, I-SceV, I-SceVI, I-SceVII, I-SexIP, I-SneIP,
I-SpomCP, I-SpomIP, I-SpomIIP, I-SquIP, I-Ssp6803I, I-SthPhiJP,
I-SthPhiST3P, I-SthPhiS3bP, I-TdeIP, I-TevI, I-TevII, I-TevIII,
I-UarAP, I-UarHGPA1P, I-UarHGPA13P, I-VinIP, I-ZbiIP, PI-MtuI,
PI-MtuHIP, PI-MtuHIIP, PI-PfuI, PI-PfuII, PI-PkoI, PI-PkoII,
PI-PspI, PI-Rma43812IP, PI-SPBetaIP, PI-SceI, PI-TfuI, PI-TfuII,
PI-ThyI, PI-TliI, PI-TliII. Preference is given here to those
Homing endonucleases whose gene sequences are already known, such
as, for example, F-SceI, I-CeuI, I-ChuI, I-DmoI, I-CpaI, I-CpaII,
I-CreI, I-CsmI, F-TevI, F-TevII, I-TevI, I-TevII, I-AniI, I-CvuI,
I-LlaI, I-NanI, I-MsoI, I-NitI, I-NjaI, I-PakI, I-PorI, I-PpoI,
I-ScaI, I-Ssp6803I, PI-PkoI, PI-PkoII, PI-PspI, PI-TfuI, PI-TliI.
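
Why long recognition sequences help to meet the requirement stated
above (no further recognition regions in the target genome) can be
made concrete with a short Python sketch. It is illustrative only and
rests on assumptions that are not part of the specification: a random
genome with equal base frequencies, a hypothetical genome size of
1.25e8 bp, and site lengths of 6 bp (typical of many type II
restriction enzymes) and 18 bp (chosen as an example of a long Homing
endonuclease site). The helper names are hypothetical as well.

```python
# Illustrative sketch only; genome size and site lengths are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_sites(genome: str, site: str) -> int:
    """Count occurrences of `site` on either strand (overlaps allowed)."""
    targets = {site, revcomp(site)}
    n = len(site)
    return sum(genome[i:i + n] in targets for i in range(len(genome) - n + 1))


def expected_random_sites(genome_bp: float, site_len: int) -> float:
    """Expected hits per strand in a random sequence with equal base
    frequencies: genome length divided by 4**site_len."""
    return genome_bp / 4 ** site_len


GENOME_BP = 1.25e8  # hypothetical genome size
short_site = expected_random_sites(GENOME_BP, 6)
long_site = expected_random_sites(GENOME_BP, 18)
```

Under these assumptions a 6 bp site is expected roughly 30,000 times
by chance, whereas an 18 bp site is expected far less than once, so a
deliberately introduced site is likely to be the only one. For a real
target genome, a direct search such as `count_sites` on the actual
sequence would replace the estimate.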




Very particular preference is given to 




I-CeuI (Cote MJ and Turmel M (1995) Curr Genet 27:177-183;
Gauthier A et al. (1991) Curr Genet 19:43-47; Marshall (1991)
Gene 104:241-245; GenBank Acc. No.: Z17234, nucleotides 5102
to 5758),

I-ChuI (Cote V et al. (1993) Gene 129:69-76; GenBank Acc. No.:
L06107, nucleotides 419 to 1075),

I-CmoeI (Drouin M et al. (2000) Nucl Acids Res 28:4566-4572),

I-CpaI from Chlamydomonas pallidostigmatica (GenBank Acc.
No.: L36830, nucleotides 357 to 815; Turmel M et al. (1995)
Nucleic Acids Res 23:2519-2525; Turmel M et al. (1995)
Mol Biol Evol 12:533-545),

I-CpaII (Turmel M et al. (1995) Mol Biol Evol 12:533-545;
GenBank Acc. No.: L39865, nucleotides 719 to 1423),

I-CreI (Wang J et al. (1997) Nucleic Acids Res 25:3767-3776;
Dürrenberger F and Rochaix JD (1991) EMBO J 10:3495-3501;
GenBank Acc. No.: X01977, nucleotides 571 to 1062),

I-CsmI (Ma DP et al. (1992) Plant Mol Biol 18:1001-1004),

I-NanI (Elde M et al. (1999) Eur J Biochem 259:281-288;
GenBank Acc. No.: X78280, nucleotides 418 to 1155),

I-NitI (GenBank Acc. No.: X78277, nucleotides 426 to 1163),

I-NjaI (GenBank Acc. No.: X78279, nucleotides 416 to 1153),

I-PpoI (Muscarella DE and Vogt VM (1989) Cell 56:443-454;
Lin J and Vogt VM (1998) Mol Cell Biol 18:5809-5817; GenBank
Acc. No.: M38131, nucleotides 86 to 577),

I-PspI (GenBank Acc. No.: U00707, nucleotides 1839 to 3449),

I-ScaI (Monteilhet C et al. (2000) Nucleic Acids Res
28:1245-1251; GenBank Acc. No.: X95974, nucleotides 55 to 465),

I-SceI (WO 96/14408; US 5,962,327, therein Seq ID NO: 1),

Endo SceI (Kawasaki et al. (1991) J Biol Chem 266:5342-5347,
identical to F-SceI; GenBank Acc. No.: M63839, nucleotides
159 to 1589),

I-SceII (Sargueil B et al. (1990) Nucleic Acids Res
18:5659-5665),

I-SceIII (Sargueil B et al. (1991) Mol Gen Genet
255:340-341),

I-Ssp6803I (GenBank Acc. No.: D64003, nucleotides 35372 to
35824),

I-TevI (Chu et al. (1990) Proc Natl Acad Sci USA
87:3574-3578; Bell-Pedersen et al. (1990) Nucleic Acids
Res 18:3763-3770; GenBank Acc. No.: AF158101, nucleotides
144431 to 143694),

I-TevII (Bell-Pedersen et al. (1990) Nucleic Acids Res
18:3763-3770; GenBank Acc. No.: AF158101, nucleotides 45612
to 44836),

I-TevIII (Eddy et al. (1991) Genes Dev. 5:1032-1041).

Very particular preference is given to commercially available
Homing endonucleases such as I-CeuI, I-SceI, I-PpoI, PI-PspI or
PI-SceI. Most preference is given to I-SceI and I-PpoI. While the
gene coding for I-PpoI may be utilized in its natural form, the
gene coding for I-SceI possesses an editing site. Since, in
contrast to yeast mitochondria, the appropriate editing is not
carried out in higher plants, an artificial sequence encoding the
I-SceI protein must be used for heterologous expression of this
enzyme (US 5,866,361).

The enzymes may be purified from their source organisms in the 
manner familiar to the skilled worker and/or the nucleic acid se- 
quence encoding said enzymes may be cloned. The sequences of var- 
ious enzymes have been deposited with GenBank (see above). 



Artificial DSBI enzymes which may be mentioned by way of example
are chimeric nucleases which are composed of an unspecific
nuclease domain and a sequence-specific DNA-binding domain (e.g.
consisting of zinc fingers) (Smith J et al. (2000) Nucl Acids
Res 28(17):3361-3369; Bibikova M et al. (2001) Mol Cell Biol
21:289-297). Thus, for example, the catalytic domain of the
restriction endonuclease FokI has been fused to zinc finger binding
domains, thereby defining the specificity of the endonuclease
(Chandrasegaran S & Smith J (1999) Biol Chem 380:841-848; Kim YG
& Chandrasegaran S (1994) Proc Natl Acad Sci USA 91:883-887; Kim
YG et al. (1996) Proc Natl Acad Sci USA 93:1156-1160). The
described technique has also been used previously for imparting a
predefined specificity to the catalytic domain of the yeast Ho
endonuclease by fusing said domain to the zinc finger domain of
transcription factors (Nahon E & Raveh D (1998) Nucl Acids Res
26:1233-1239). It is possible, using suitable mutation and
selection processes, to adapt existing Homing endonucleases to any
desired recognition sequence.




As mentioned, zinc finger proteins are particularly suitable as
DNA-binding domains within chimeric nucleases. These DNA-binding
zinc finger domains may be adapted to any DNA sequence. Appropriate
processes for preparing corresponding zinc finger domains
have been described and are known to the skilled worker (Beerli
RR et al. (2000) Proc Natl Acad Sci USA 97(4):1495-1500; Beerli RR
et al. (2000) J Biol Chem 275(42):32617-32627; Segal DJ and Barbas
CF 3rd. (2000) Curr Opin Chem Biol 4(1):34-39; Kang JS and Kim JS
(2000) J Biol Chem 275(12):8742-8748; Beerli RR et al. (1998)
Proc Natl Acad Sci USA 95(25):14628-14633; Kim JS et al. (1997)
Proc Natl Acad Sci USA 94(8):3616-3620; Klug A (1999) J Mol Biol
293(2):215-218; Tsai SY et al. (1998) Adv Drug Deliv Rev
30(1-3):23-31; Mapp AK et al. (2000) Proc Natl Acad Sci USA
97(8):3930-3935; Sharrocks AD et al. (1997) Int J Biochem Cell
Biol 29(12):1371-1387; Zhang L et al. (2000) J Biol Chem
275(43):33850-33860). Processes for preparing and selecting zinc
finger DNA-binding domains with high sequence specificity have
been described (WO 96/06166, WO 98/53059, WO 98/53057). Fusing a
DNA-binding domain obtained in this way to the catalytic domain
of an endonuclease (such as, for example, the FokI or Ho
endonuclease) enables chimeric nucleases to be prepared which have
any desired specificity and which may be used advantageously as
DSBI enzymes within the scope of the present invention.



Artificial DSBI enzymes with altered sequence specificity may
also be generated by mutating already known restriction
endonucleases or Homing endonucleases, using methods familiar to
the skilled worker. Besides the mutagenesis of Homing endonucleases,
the mutagenesis of maturases is of particular interest for the
purpose of obtaining an altered substrate specificity. Maturases
frequently share many features with Homing endonucleases and, if
appropriate, can be converted into nucleases by carrying out a few
mutations. This has been shown, for example, for the maturase in
the bakers' yeast bi2 intron. Only two mutations in the
maturase-encoding open reading frame (ORF) sufficed to impart to
this enzyme a Homing-endonuclease activity (Szczepanek & Lazowska
(1996) EMBO J 15:3758-3767).



Further artificial nucleases may be generated with the aid of mobile
group II introns and the proteins encoded by them, or parts of these
proteins. Mobile group II introns, together with the proteins
encoded by them, form RNA-protein particles which are capable of
recognizing and cutting DNA in a sequence-specific manner. In this
context, the sequence specificity can be adapted to the requirements
by mutating particular regions of the intron (see below)
(WO 97/10362).



Preference is given to expressing the DSBI enzyme as a fusion
protein with a nuclear localization sequence (NLS). This NLS
sequence enables facilitated transport into the nucleus and
increases the efficiency of the recombination system. Various NLS
sequences are known to the skilled worker and described, inter alia,
in Hicks GR and Raikhel NV (1995) Annu. Rev. Cell Biol. 11:155-188.
For example, the NLS sequence of the SV40 large antigen is preferred
for plant organisms. Very particular preference is given to the
following NLS sequences:

NLS1: N-Pro-Lys-Thr-Lys-Arg-Lys-Val-C

NLS2: N-Pro-Lys-Lys-Lys-Arg-Lys-Val-C

Owing to the small size of many DSBI enzymes (such as, for example,
the Homing endonucleases), an NLS sequence is not absolutely
necessary, however. These enzymes are able to pass through the
nuclear pores even without this assistance.
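
The two NLS peptides above are given in three-letter notation. The following is a minimal Python sketch, not part of the described process, that converts this notation into one-letter code and prepends it to a protein sequence. The function names and the placeholder enzyme sequence are hypothetical and serve only as illustration.

```python
# Standard three-letter to one-letter amino acid abbreviations.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

NLS1 = "N-Pro-Lys-Thr-Lys-Arg-Lys-Val-C"
NLS2 = "N-Pro-Lys-Lys-Lys-Arg-Lys-Val-C"


def to_one_letter(peptide: str) -> str:
    """Convert 'N-Xaa-...-Xaa-C' notation into a one-letter string."""
    # The leading 'N' and trailing 'C' mark the termini, not residues.
    residues = [r for r in peptide.split("-") if r not in ("N", "C")]
    return "".join(THREE_TO_ONE[r] for r in residues)


def fuse_n_terminal(nls: str, protein: str) -> str:
    """Return a fusion with the NLS placed directly before the protein."""
    return to_one_letter(nls) + protein


# Hypothetical placeholder, NOT a real DSBI enzyme sequence.
placeholder_enzyme = "MKNIKKNQ"

print(to_one_letter(NLS1))                        # PKTKRKV
print(to_one_letter(NLS2))                        # PKKKRKV
print(fuse_n_terminal(NLS2, placeholder_enzyme))  # PKKKRKVMKNIKKNQ
```

Whether the NLS is fused N-terminally or C-terminally, and whether a start methionine or linker is added, is a design choice that the sketch leaves open.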



"Recognition sequence for targeted induction of DNA double-strand 
breaks" means in general those sequences which allow recognition
and cleavage by the DSBI enzyme under the conditions in the
eukaryotic cell or organism used in this case. In this context,
mention is made, by way of example but not by limitation, in table 1
below of the recognition sequences for the particular DSBI enzymes
listed.



Table 1: Recognition sequences and source organisms of DSBI enzymes
("^" indicates the cleavage site of the DSBI enzyme within a
recognition sequence)

DSBI enzyme  Source organism                Recognition sequence

CRE          Bacteriophage P1               5'-AACTCTCATCGCTTCGGATAACTTCCTGTTATCCGAAACAT
                                               ATCACTCACTTTGGTGATTTCACCGTAACTGTCTATGATTAATG-3'
FLP          Saccharomyces cerevisiae       5'-GAAGTTCCTATTCCGAAGTTCCTATTCTCTAGAAAGTATAGGAACTTC-3'
R            pSR1 plasmids                  5'-CGAGATCATATCACTGTGGACGTTGATGAAAGAATACGTTATTCTTTCATCAAATCGT
P-element    Drosophila                     5'-CTAGATGAAATAACATAAGGTGG
transposase
I-AniI       Aspergillus nidulans           5'-TTGAGGAGGTT^TCTCTGTAAATAANNNNNNNNNNNNNNN
                                            3'-AACTCCTCCAAAGAGACATTTATTNNNNNNNNNNNNNNN^
I-DdiI       Dictyostelium discoideum AX3   5'-TTTTTTGGTCATCCAGAAGTATAT
I-CvuI       Chlorella vulgaris             5'-CTGGGTTCAAAACGTCGTGA^GACAGTTTGG
                                            3'-GACCCAAGTTTTGCAG^CACTCTGTCAAACC
I-CsmI       Chlamydomonas smithii          5'-GTACTAGCATGGGGTCAAATGTCTTTCTGG
I-CmoeI      Chlamydomonas moewusii         5'-TCGTAGCAGCT^CACGGTT
                                            3'-AGCATCG^TCGAGTGCCAA
I-CreI       Chlamydomonas reinhardtii      5'-CTGGGTTCAAAACGTCGTGA^GACAGTTTGG
                                            3'-GACCCAAGTTTTGCAG^CACTCTGTCAAACC
I-ChuI       Chlamydomonas humicola         5'-GAAGGTTTGGCACCTCG^ATGTCGGCTCATC
                                            3'-CTTCCAAACCGTG^GAGCTACAGCCGAGTAG
I-CpaI       Chlamydomonas                  5'-CGATCCTAAGGTAGCGAA^ATTCA
             pallidostigmatica              3'-GCTAGGATTCCATC^GCTTTAAGT
I-CpaII      Chlamydomonas                  5'-CCCGGCTAACTC^TGTGCCAG
             pallidostigmatica              3'-GGGCCGAT^TGAGACACGGTC
I-CeuI       Chlamydomonas eugametos        5'-CGTAACTATAACGGTCCTAA^GGTAGCGAA
                                            3'-GCATTGATATTGCCAG^GATTCCATCGCTT
I-DmoI       Desulfurococcus mobilis        5'-ATGCCTTGCCGGGTAA^GTTCCGGCGCGCAT
                                            3'-TACGGAACGGCC^CATTCAAGGCCGCGCGTA
I-SceI       S. cerevisiae                  5'-AGTTACGCTAGGGATAA^CAGGGTAATATAG
                                            3'-TCAATGCGATCCC^TATTGTCCCATTATATC
                                            5'-TAGGGATAA^CAGGGTAAT
                                            3'-ATCCC^TATTGTCCCATTA ("Core" sequence)
I-SceII      S. cerevisiae                  5'-TTTTGATTCTTTGGTCACCC^TGAAGTATA
                                            3'-AAAACTAAGAAACCAG^TGGGACTTCATAT
I-SceIII     S. cerevisiae                  5'-ATTGGAGGTTTTGGTAAC^TATTTATTACC
                                            3'-TAACCTCCAAAACC^ATTGATAAATAATGG
I-SceIV      S. cerevisiae                  5'-TCTTTTCTCTTGATTA^GCCCTAATCTACG
                                            3'-AGAAAAGAGAAC^TAATCGGGATTAGATGC
I-SceV       S. cerevisiae                  5'-AATAATTTTCT^TCTTAGTAATGCC
                                            3'-TTATTAAAAGAAGAATCATTA^CGG
I-SceVI      S. cerevisiae                  5'-GTTATTTAATG^TTTTAGTAGTTGG
                                            3'-CAATAAATTACAAAATCATCA^ACC
I-SceVII     S. cerevisiae                  5'-TGTCACATTGAGGTGCACTAGTTATTAC
PI-SceI      S. cerevisiae                  5'-ATCTATGTCGGGTGC^GGAGAAAGAGGTAAT
                                            3'-TAGATACAGCC^CACGCCTCTTTCTCCATTA
F-SceI       S. cerevisiae                  5'-GATGCTGTAGGC^ATAGGCTTGGTT
                                            3'-CTACGACA^TCCGTATCCGAACCAA
F-SceII      S. cerevisiae                  5'-CTTTCCGCAACA^GTAAAATT
                                            3'-GAAAGGCG^TTGTCATTTTAA
I-HmuI       Bacillus subtilis              5'-AGTAATGAGCCTAACGCTCAGCAA
             bacteriophage SPO1             3'-TCATTACTCGGATTGC^GAGTCGTT
I-HmuII      Bacillus subtilis              5'-AGTAATGAGCCTAACGCTCAACAANNNNNNNNNNNNNNNNNN
             bacteriophage SP82                NNNNNNNNNNNNNNNNNNNNNN
I-LlaI       Lactococcus lactis             5'-CACATCCATAAC^CATATCATTTTT
                                            3'-GTGTAGGTATTGGTATAGTAA^AAA
I-MsoI       Monomastix species             5'-CTGGGTTCAAAACGTCGTGA^GACAGTTTGG
                                            3'-GACCCAAGTTTTGCAG^CACTCTGTCAAACC
I-NanI       Naegleria andersoni            5'-AAGTCTGGTGCCA^GCACCCGC
                                            3'-TTCAGACC^ACGGTCGTGGGCG
I-NitI       Naegleria italica              5'-AAGTCTGGTGCCA^GCACCCGC
                                            3'-TTCAGACC^ACGGTCGTGGGCG
I-NjaI       Naegleria jamiesoni            5'-AAGTCTGGTGCCA^GCACCCGC
                                            3'-TTCAGACC^ACGGTCGTGGGCG
I-PakI       Pseudendoclonium akinetum      5'-CTGGGTTCAAAACGTCGTGA^GACAGTTTGG
                                            3'-GACCCAAGTTTTGCAG^CACTCTGTCAAACC
I-PorI       Pyrobaculum organotrophum      5'-GCGAGCCCGTAAGGGT^GTGTACGGG
                                            3'-CGCTCGGGCATT^CCCACACATGCCC
I-PpoI       Physarum polycephalum          5'-TAACTATGACTCTCTTAA^GGTAGCCAAAT
                                            3'-ATTGATACTGAGAG^AATTCCATCGGTTTA
I-ScaI       Saccharomyces capensis         5'-TGTCACATTGAGGTGCACT^AGTTATTAC
                                            3'-ACAGTGTAACTCCAC^GTGATCAATAATG
I-Ssp6803I   Synechocystis species          5'-GTCGGGCT^CATAACCCGAA
                                            3'-CAGCCCGAGTA^TTGGGCTT
PI-PfuI      Pyrococcus furiosus Vc1        5'-GAAGATGGGAGGAGGG^ACCGGACTCAACTT
                                            3'-CTTCTACCCTCC^TCCCTGGCCTGAGTTGAA
PI-PfuII     Pyrococcus furiosus Vc1        5'-ACGAATCCATGTGGAGA^AGAGCCTCTATA
                                            3'-TGCTTAGGTACAC^CTCTTCTCGGAGATAT
PI-PkoI      Pyrococcus kodakaraensis KOD1  5'-GATTTTAGAT^CCCTGTACC
                                            3'-CTAAAA^TCTAGGGACATGG
PI-PkoII     Pyrococcus kodakaraensis KOD1  5'-CAGTACTACG^GTTAC
                                            3'-GTCATG^ATGCCAATG
PI-PspI      Pyrococcus sp.                 5'-AAAATCCTGGCAAACAGCTATTAT^GGGTAT
                                            3'-TTTTAGGACCGTTTGTCGAT^AATACCCATA
PI-TfuI      Thermococcus fumicolans ST557  5'-TAGATTTTAGGT^CGCTATATCCTTCC
                                            3'-ATCTAAAA^TCCAGCGATATAGGAAGG
PI-TfuII     Thermococcus fumicolans ST557  5'-TAYGCNGAYACN^GACGGYTTYT
                                            3'-ATRCGNCT^RTGNCTGCCRAARA
PI-ThyI      Thermococcus hydrothermalis    5'-TAYGCNGAYACN^GACGGYTTYT
                                            3'-ATRCGNCT^RTGNCTGCCRAARA
PI-TliI      Thermococcus litoralis         5'-TAYGCNGAYACNGACGG^YTTYT
                                            3'-ATRCGNCTRTGNC^TGCCRAARA
PI-TliII     Thermococcus litoralis         5'-AAATTGCTTGCAAACAGCTATTACGGCTAT
I-TevI       Bacteriophage T4               5'-AGTGGTATCAAC^GCTCAGTAGATG
                                            3'-TCACCATAGT^TGCGAGTCATCTAC
I-TevII      Bacteriophage T4               5'-GCTTATGAGTATGAAGTGAACACGT^TATTC
                                            3'-CGAATACTCATACTTCACTTGTG^CAATAAG
F-TevI       Bacteriophage T4               5'-GAAACACAAGA^AATGTTTAGTAAANNNNNNNNNNNNNN
                                            3'-CTTTGTGTTCTTTACAAATCATTTNNNNNNNNNNNNNN^
F-TevII      Bacteriophage T4               5'-TTTAATCCTCGCTTC^AGATATGGCAACTG
                                            3'-AAATTAGGAGCGA^AGTCTATACCGTTGAC
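
Several recognition sequences in Table 1 contain degenerate positions (Y, R, N). The following minimal Python sketch shows one way to search a DNA string for such a site by translating the IUPAC symbols into a regular expression. It is an illustration only: the helper names and the test sequence are hypothetical, only the symbols occurring in the table are covered, and only the given strand is searched.

```python
import re

# IUPAC symbols used in Table 1 (subset; extend as needed).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}


def site_to_pattern(site: str) -> str:
    """Translate a recognition sequence into a regex, ignoring '^'."""
    return "".join(IUPAC[base] for base in site.replace("^", ""))


def find_sites(site: str, dna: str) -> list[int]:
    """Return 0-based start positions of all (overlapping) matches."""
    # A lookahead is used so that overlapping occurrences are reported.
    lookahead = re.compile(f"(?=({site_to_pattern(site)}))")
    return [m.start() for m in lookahead.finditer(dna)]


# Top strand of the PI-TliI site as listed in Table 1.
pi_tlii = "TAYGCNGAYACNGACGG^YTTYT"

# Hypothetical test sequence containing one instance of the site.
dna = "GG" + "TATGCAGACACTGACGGCTTTT" + "AA"

print(find_sites(pi_tlii, dna))  # [2]
```

To cover both strands, the reverse complement of the target sequence would have to be searched as well.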



Relatively small deviations (degenerations) of the recognition
sequence which nevertheless make possible recognition and cleavage
by the particular DSBI enzyme are also included here. Such
deviations, also in connection with different basic conditions such
as, for example, calcium or magnesium concentration, have been
described (Argast GM et al. (1998) J Mol Biol 280:345-353). Core
sequences of these recognition sequences are also included. It is
known that the inner portions of the recognition sequences also
suffice for an induced double-strand break and that the outer
portions are not necessarily relevant but may contribute to
determining the cleavage efficiency. Thus, for example, an 18 bp
core sequence can be defined for I-SceI.
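
As an illustration of the cleavage-site notation, the following minimal Python sketch takes the 18 bp I-SceI core sequence with "^" marking the cut in each strand, checks that the two strands are complementary, and derives the length and type of the resulting overhang. The function names are hypothetical; the top strand is assumed to be written 5' to 3' and the bottom strand 3' to 5', both left to right.

```python
# Translation table for complementing a DNA strand base by base.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_site(strand: str) -> tuple[str, int]:
    """Return (sequence without '^', number of bases left of the cut)."""
    cut = strand.index("^")
    return strand.replace("^", ""), cut


def overhang(top: str, bottom: str) -> tuple[int, str]:
    """Compute overhang length and type from the two cut positions."""
    top_seq, top_cut = parse_site(top)
    bottom_seq, bottom_cut = parse_site(bottom)
    if bottom_seq != top_seq.translate(COMPLEMENT):
        raise ValueError("strands are not complementary")
    length = abs(top_cut - bottom_cut)
    if length == 0:
        kind = "blunt"
    elif top_cut > bottom_cut:
        # Top strand is cut further right: its 3' end protrudes.
        kind = "3' overhang"
    else:
        kind = "5' overhang"
    return length, kind


# I-SceI core sequence; top strand 5'->3', bottom strand 3'->5'.
top = "TAGGGATAA^CAGGGTAAT"
bottom = "ATCCC^TATTGTCCCATTA"

print(overhang(top, bottom))  # (4, "3' overhang")
```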

Said DSBI recognition sequences may be localized in various
positions in or close to a marker protein gene and, for example when
the marker protein used is a transgene, may already be incorporated
when constructing the marker protein expression cassette. Various
possible localizations are illustrated by way of example in
Figs. 2-A, 2-B, 3 and 5 and in the descriptions thereof.



In a further advantageous embodiment, the insertion sequence
comprises at least one homology sequence A which has a sufficient
length and a sufficient homology to a sequence A' in the marker
protein gene in order to ensure homologous recombination between A
and A'. The insertion sequence is preferably flanked by two
sequences A and B which have a sufficient length and a sufficient
homology to a sequence A' and, respectively, B' in the marker
protein gene in order to ensure homologous recombination between A
and A' and, respectively, B and B'.



"Sufficient length" means, with respect to the homology sequences
A, A' and B, B', preferably sequences with a length of at least
100 base pairs, preferably at least 250 base pairs, particularly
preferably at least 500 base pairs, very particularly preferably at
least 1000 base pairs, most preferably at least 2500 base pairs.

"Sufficient homology" means, with respect to the homology
sequences, preferably sequences whose homology to one another is at
least 70%, preferably at least 80%, preferentially at least 90%,
particularly preferably at least 95%, very particularly preferably
at least 99%, most preferably 100%, over a length of at least 20
base pairs, preferably at least 50 base pairs, particularly
preferably at least 100 base pairs, very particularly preferably at
least 250 base pairs, most preferably at least 500 base pairs.
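
The thresholds above can be illustrated with a minimal Python sketch. It computes the percent identity of two sequences that are assumed to be already aligned to equal length (gaps written as "-") and compares it with the broadest stated limits (at least 70% over at least 20 base pairs). This is a simplification under stated assumptions and does not reproduce the GAP program; all names and example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identical positions of two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    if not a:
        raise ValueError("sequences must not be empty")
    # A gap character never counts as a match.
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def is_sufficiently_homologous(a: str, b: str,
                               min_identity: float = 70.0,
                               min_length: int = 20) -> bool:
    """Check the broadest 'sufficient homology' limits given above."""
    return len(a) >= min_length and percent_identity(a, b) >= min_identity


# Hypothetical 24 bp example with two mismatches (first and last base).
a = "ACGT" * 6
b = "T" + a[1:-1] + "A"

print(round(percent_identity(a, b), 1))                      # 91.7
print(is_sufficiently_homologous(a, b))                      # True
print(is_sufficiently_homologous(a, b, min_identity=95.0))   # False
```

The stricter preferred tiers (for example 95% over 100 base pairs) can be tested by passing different values for `min_identity` and `min_length`.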



Homology between two nucleic acids means the identity of the
nucleic acid sequence over in each case the entire sequence length,
which identity is calculated by way of comparison with the aid of
the GAP program algorithm (Wisconsin Package Version 10.0,
University of Wisconsin, Genetics Computer Group (GCG), Madison,
USA), setting the following parameters:

Gap Weight: 12          Length Weight: 4

Average Match: 2.912    Average Mismatch: -2.003

In a further preferred embodiment, the recombination efficiency is
increased by a combination with processes which promote homologous
recombination. Such systems have been described and comprise, by way
of example, expression of proteins such as RecA or treatment with
PARP inhibitors. It has been demonstrated that the intrachromosomal
homologous recombination in tobacco plants can be increased by using
PARP inhibitors (Puchta H et al. (1995) Plant J 7:203-210). The use
of these inhibitors can further increase the rate of homologous
recombination in the recombinant constructs, after inducing the
sequence-specific DNA double-strand break, and thus the efficiency
of the deletion of the transgene sequences. Various PARP inhibitors
may be used here. Preference is given to including inhibitors such
as 3-aminobenzamide, 8-hydroxy-2-methylquinazolin-4-one (NU1025),
1,11b-dihydro-[2H]benzopyrano[4,3,2-de]isoquinolin-3-one (GPI 6150),
5-aminoisoquinolinone,
3,4-dihydro-5-[4-(1-piperidinyl)butoxy]-1(2H)-isoquinolinone or the
substances described in WO 00/26192, WO 00/29384, WO 00/32579,
WO 00/64878, WO 00/68206, WO 00/67734, WO 01/23386 and WO 01/23390.

Further suitable methods are the introduction of nonsense mutations
into endogenous marker protein genes, for example by means of
introducing RNA/DNA oligonucleotides into the plant (Zhu et al.
(2000) Nat Biotechnol 18(5):555-558). Point mutations may also be
generated by means of DNA-RNA hybrids which are also known as
"chimeraplasty" (Cole-Strauss et al. (1999) Nucl Acids Res
27(5):1323-1330; Kmiec (1999) Gene therapy American Scientist
87(3):240-247).



The methods of dsRNAi, cosuppression by means of sense RNA and VIGS
(virus induced gene silencing) are also referred to as
post-transcriptional gene silencing (PTGS). PTGS processes are
particularly advantageous because the demands on the homology
between the marker protein gene to be reduced and the transgenically
expressed sense or dsRNA nucleic acid sequence are lower than, for
example, in the case of a traditional antisense approach. Thus it is
possible, using the marker protein nucleic acid sequences from one
species, to effectively reduce also expression of homologous marker
proteins in other species, without it being absolutely necessary to
isolate and to elucidate the structure of the marker protein
homologues occurring there. Considerably less labor is therefore
required.

"Introduction" comprises within the scope of the invention any
processes which are suitable for introducing an "anti-marker
protein" compound, directly or indirectly, into a plant or a cell,
compartment, tissue, organ or seeds of said plant or generating said
compound there. The introduction may result in a transient presence
of an "anti-marker protein" compound (for example a dsRNA or a
recombinase) or else in a permanent (stable) presence.



According to the different nature of the approaches described
above, the "anti-marker protein" compound may exert its function
directly (for example by way of insertion into an endogenous marker
protein gene). However, said function may also be exerted indirectly
after transcription into an RNA (for example in antisense
approaches) or after transcription and translation into a protein
(for example in the case of recombinases or DSBI enzymes). The
invention comprises both directly and indirectly acting "anti-marker
protein" compounds.

Introducing comprises, for example, processes such as transfection,
transduction or transformation.

"Anti-marker protein" compounds thus comprise, for example, also
expression cassettes capable of implementing expression (i.e.
transcription and, if appropriate, translation) of, for example, an
MP-dsRNA, an MP-antisenseRNA, a sequence-specific recombinase or a
DSBI enzyme in a plant cell.

"Expression cassette" means within the scope of the present
invention generally those constructions in which a nucleic acid
sequence to be expressed is functionally linked to at least one
genetic control sequence, preferably a promoter sequence. Expression
cassettes preferably consist of double-stranded DNA and may have a
linear or circular structure.



A functional linkage means, for example, the sequential arrangement
of a promoter with a nucleic acid sequence to be transcribed (for
example coding for an MP-dsRNA or a DSBI enzyme) and, if
appropriate, further regulatory elements such as, for example, a
terminator and/or polyadenylation signals in such a way that each of
the regulatory elements can fulfill its function during
transcription of the nucleic acid sequence, depending on the
arrangement of the nucleic acid sequences. In this context, function
can mean, for example, the control of expression, i.e. transcription
and/or translation, of the nucleic acid sequence (e.g. coding for an
MP-dsRNA or a DSBI enzyme). In this context, control comprises, for
example, initiating, increasing, controlling or suppressing the
expression, i.e. transcription and, if appropriate, translation.
This does not necessarily require a direct linkage in the chemical
sense. Genetic control sequences such as, for example, enhancer
sequences, may exert their function on the target sequence also from
positions further afar or even from different DNA molecules.
Preference is given to arrangements in which the nucleic acid
sequence to be transcribed is positioned downstream of the sequence
acting as promoter so that both sequences are covalently connected
to one another. The distance between the promoter sequence and the
nucleic acid sequence to be expressed transgenically is here
preferably less than 200 base pairs, particularly preferably less
than 100 base pairs, very particularly preferably less than 50 base
pairs.



The skilled worker knows various ways of obtaining any of the
expression cassettes of the invention. An expression cassette of the
invention is prepared, for example, preferably by direct fusion of a
nucleic acid sequence acting as promoter to a nucleotide sequence to
be expressed (e.g. coding for an MP-dsRNA or a DSBI enzyme). A
functional linkage may be produced by means of common recombination
and cloning techniques, as are described, for example, in Maniatis
T, Fritsch EF and Sambrook J (1989) Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY and in
Silhavy TJ et al. (1984) Experiments with Gene Fusions, Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY and in Ausubel FM et al.
(1987) Current Protocols in Molecular Biology, Greene Publishing
Assoc. and Wiley Interscience.



The expression cassettes of the invention preferably comprise a
promoter 5' upstream of the particular nucleic acid sequence to be
expressed transgenically and a terminator sequence as an additional
genetic control sequence 3' downstream and also, if appropriate,
further customary regulatory elements, in each case functionally
linked to the nucleic acid sequence to be expressed
transgenically.
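
The arrangement described here (promoter 5' upstream, terminator 3' downstream, with a short distance between promoter and transcribed sequence) can be pictured with the following minimal Python sketch. It is a data-structure illustration only, not a cloning protocol; the class, its field names and all sequences are hypothetical placeholders. The distance tiers are the ones stated in the text (less than 200, 100 and 50 base pairs).

```python
from dataclasses import dataclass


@dataclass
class ExpressionCassette:
    """Linear 5'->3' arrangement of cassette elements (placeholders)."""
    promoter: str
    spacer: str        # sequence between promoter and transcribed region
    transcribed: str   # e.g. coding for an MP-dsRNA or a DSBI enzyme
    terminator: str

    def assemble(self) -> str:
        """Concatenate the elements in 5'->3' order."""
        return self.promoter + self.spacer + self.transcribed + self.terminator

    def spacer_preference(self) -> str:
        """Classify the promoter distance by the tiers given in the text."""
        n = len(self.spacer)
        if n < 50:
            return "very particularly preferred"
        if n < 100:
            return "particularly preferred"
        if n < 200:
            return "preferred"
        return "outside the preferred range"


# Hypothetical placeholder sequences, not real genetic elements.
cassette = ExpressionCassette(
    promoter="TATAAA",
    spacer="G" * 30,
    transcribed="ATGAAATAA",
    terminator="AATAAA",
)

print(len(cassette.assemble()))      # 51
print(cassette.spacer_preference())  # very particularly preferred
```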



The term "genetic control sequences" is to be understood broadly
and means all those sequences which have an influence on the making
or function of the expression cassette of the invention. For
example, genetic control sequences ensure transcription and, if
appropriate, translation in prokaryotic or eukaryotic organisms.
Genetic control sequences are described, for example, in "Goeddel;
Gene Expression Technology: Methods in Enzymology 185, Academic
Press, San Diego, CA (1990)" or "Gruber and Crosby, in: Methods in
Plant Molecular Biology and Biotechnology, CRC Press, Boca Raton,
Florida, eds.: Glick and Thompson, Chapter 7, 89-108" and in the
references quoted there.



Genetic control sequences comprise, in particular in plants,
functional promoters. Preferred promoters suitable for the
expression cassettes are in principle any promoters capable of
controlling expression of genes, in particular foreign genes, in
plants.

Plant-specific promoters or promoters functional in plants or in a
plant cell mean in principle any promoter capable of controlling
expression of genes, in particular foreign genes, in at least one
plant or one part, cell, tissue or culture of a plant. In this
context, expression may be, for example, constitutive, inducible or
development-dependent. Preference is given to:

a) Constitutive promoters

"Constitutive" promoters means those promoters which ensure
expression in numerous, preferably all, tissues over a relatively
large period of plant development, preferably at all points in time
of plant development (Benfey et al. (1989) EMBO J 8:2195-2202).
Preference is given in particular to using a plant promoter or a
promoter which is derived from a plant virus. Particular preference
is given to the promoter of the 35S transcript of the CaMV
cauliflower mosaic virus (Franck et al. (1980) Cell 21:285-294;
Odell et al. (1985) Nature 313:810-812; Shewmaker et al. (1985)
Virology 140:281-288; Gardner et al. (1986) Plant Mol Biol
6:221-228) or the 19S CaMV promoter (US 5,352,605; WO 84/02913;
Benfey et al. (1989) EMBO J 8:2195-2202) and also to the promoter of
the Arabidopsis thaliana nitrilase-1 gene (GenBank Acc. No.: Y07648,
nucleotides 2456 (alternatively 2861) to 4308 or alternatively 4340
or 4344, e.g. bp 2456 to 4340).

Other suitable constitutive promoters are the rubisco small subunit
(SSU) promoter (US 4,962,028), the leguminB promoter (GenBank Acc.
No.: X03677), the promoter of the Agrobacterium nopaline synthase,
the TR dual promoter, the Agrobacterium OCS (octopine synthase)
promoter, the ubiquitin promoter (Holtorf S et al. (1995) Plant Mol
Biol 29:637-649), the ubiquitin 1 promoter (Christensen et al.
(1992) Plant Mol Biol 18:675-689; Bruce et al. (1989) Proc Natl Acad
Sci USA 86:9692-9696), the Smas promoter, the cinnamyl alcohol
dehydrogenase promoter (US 5,683,439), the promoters of the vacuolar
ATPase subunits or the promoter of a proline-rich protein from wheat
(WO 91/13991), and further promoters of genes whose constitutive
expression in plants is known to the skilled worker.

b) Tissue-specific promoters

Preference is given to promoters with specificities for the anthers,
ovaries, flowers, leaves, stems, roots or seeds.

Seed-specific promoters comprise, for example, the promoter of
phaseolin (US 5,504,200; Bustos MM et al. (1989) Plant Cell
1(9):839-53), of the 2S albumin (Josefsson LG et al. (1987) J Biol
Chem 262:12196-12201), of legumin (Shirsat A et al. (1989) Mol Gen
Genet 215(2):326-331), of USP (unknown seed protein; Baumlein H
et al. (1991) Mol Gen Genet 225(3):459-67), of napin (US 5,608,152;
Stalberg K et al. (1996) Planta 199:515-519), of the sucrose-binding
protein (WO 00/26388), of legumin B4 (LeB4; Baumlein H et al. (1991)
Mol Gen Genet 225:121-128; Baeumlein et al. (1992) Plant Journal
2(2):233-9; Fiedler U et al. (1995) Biotechnology (NY)
13(10):1090f), of oleosin (WO 98/45461) or of Bce4 (WO 91/13980).
Further suitable seed-specific promoters are those of the genes
coding for the high molecular weight glutenin (HMWG), gliadin,
branching enzyme, ADP glucose pyrophosphorylase (AGPase) or starch
synthase. Preference is further given to promoters which allow
seed-specific expression in monocotyledones such as corn, barley,
wheat, rye, rice, etc. Promoters which may be employed
advantageously are the promoter of the lpt2 or lpt1 gene
(WO 95/15389, WO 95/23230) and the promoters described in
WO 99/16890 (hordein, glutelin, oryzin, prolamin, gliadin, zein,
kafirin or secalin promoters). Further seed-specific promoters are
described in WO 89/03887.

Tuber-, storage-root- or root-specific promoters comprise, for
example, the class I patatin promoter (B33) or the promoter of the
potato cathepsin D inhibitor.

Leaf-specific promoters comprise, for example, the promoter of the
potato cytosolic FBPase (WO 97/05900), the SSU promoter (small
subunit) of rubisco (ribulose-1,5-bisphosphate carboxylase) or the
potato ST-LSI promoter (Stockhaus et al. (1989) EMBO J 8:2445-2451).

Flower-specific promoters comprise, for example, the phytoene
synthase promoter (WO 92/16635) or the promoter of the P-rr gene
(WO 98/22593).

Anther-specific promoters comprise, for example, the 5126 promoter
(US 5,689,049, US 5,689,051), the glob-1 promoter and the γ-zein
promoter.



c) Chemically inducible promoters

Chemically inducible promoters allow expression control as a
function of an exogenous stimulus (review article: Gatz et al.
(1997) Ann Rev Plant Physiol Plant Mol Biol 48:89-108). Examples
which may be mentioned are: the PRP1 promoter (Ward et al. (1993)
Plant Mol Biol 22:361-366), a salicylic acid-inducible promoter
(WO 95/19443), a benzenesulfonamide-inducible promoter
(EP-A 0 388 186), a tetracycline-inducible promoter (Gatz et al.
(1992) Plant J 2:397-404), an abscisic acid-inducible promoter
(EP 0 335 528) and an ethanol- or cyclohexanone-inducible promoter
(WO 93/21334). Also suitable is the promoter of the glutathione
S-transferase isoform II gene (GST-II-27), which may be activated by
exogenously applied safeners such as, for example,
N,N-diallyl-2,2-dichloroacetamide (WO 93/01294) and which is
functional in numerous tissues of both monocotyledones and
dicotyledones.



Particular preference is given to constitutive or inducible
promoters.

Preference is further given to plastid-specific promoters for
targeted expression in the plastids. Suitable promoters are
described, for example, in WO 98/55595 or WO 97/06250. Promoters
which may be mentioned here are the rpoB promoter element, the atpB
promoter element, the clpP promoter element (see also WO 99/46394)
and the 16S rDNA promoter element. Viral promoters are also suitable
(WO 95/16783).

Targeted expression in plastids may also be achieved by using, for
example, a bacterial or bacteriophage promoter, introducing the
resulting expression cassette into the plastid DNA and then
effecting expression by means of a fusion protein of a bacterial or
bacteriophage polymerase and a plastid transit peptide. US 5,925,806
describes an appropriate process.

Genetic control sequences further comprise also the 5'-untranslated
regions, introns or noncoding 3' region of genes, such as, for
example, the actin-1 intron, or the Adh1-S introns 1, 2 and 6
(general overview: The Maize Handbook, Chapter 116, Freeling and
Walbot, Eds., Springer, New York (1994)). These sequences have been
shown to be able to play a significant function in the regulation of
gene expression. Thus it has been demonstrated that 5'-untranslated
sequences may increase transient expression of heterologous genes.
They may further promote tissue specificity (Rouster J et al. (1998)
Plant J. 15:435-440). As an example of translation enhancers,
mention may be made of the 5' leader sequence of the tobacco mosaic
virus (Gallie et al. (1987) Nucl Acids Res 15:8693-8711).



Polyadenylation signals suitable as control sequences are in
particular polyadenylation signals of plant genes and also
Agrobacterium tumefaciens T-DNA polyadenylation signals. Examples of
particularly suitable terminator sequences are the OCS (octopine
synthase) terminator and the NOS (nopaline synthase) terminator
(Depicker A et al. (1982) J Mol Appl Genet 1:561-573) and also the
terminators of soybean actin, RUBISCO or alpha-amylase from wheat
(Baulcombe DC et al. (1987) Mol Gen Genet 209:33-40).

Advantageously, the expression cassette may contain one or more
"enhancer sequences" functionally linked to the promoter, which make
increased transgenic expression of the nucleic acid sequence
possible.

Genetic control sequences further means sequences coding for fusion
proteins consisting of a signal peptide sequence. The expression of
a target gene is possible in any desired cell compartment, such as,
for example, the endomembrane system, the vacuole and the
chloroplasts. Desired glycosylation reactions, in particular
foldings, and the like are possible by utilizing the secretory
pathway. Secretion of the target protein to the cell surface or
secretion into the culture medium, for example when using
suspension-cultured cells or protoplasts, is also possible. The
target sequences required for this may both be taken into account in
individual vector variations and be introduced into the vector
together with the target gene to be cloned by using a suitable
cloning strategy. Target sequences which may be used are both
endogenous, if present, and heterologous sequences. Additional
heterologous sequences which are preferred for functional linkage
but not limited thereto are further targeting sequences for ensuring
subcellular localization in the apoplast, in the vacuole, in
plastids, in the mitochondrion, in the endoplasmic reticulum (ER),
in the nucleus, in elaioplasts or other compartments; and also
translation enhancers such as the 5' leader sequence from tobacco
mosaic virus (Gallie et al. (1987) Nucl Acids Res 15:8693-8711) and
the like. The process of transporting proteins which are per se not
located in the plastids specifically into said plastids has been
described (Klosgen RB and Weil JB (1991) Mol Gen Genet
225(2):297-304; Van Breusegem F et al. (1998) Plant Mol Biol
38(3):491-496).



Control sequences are furthermore understood to be those which make
possible a homologous recombination or insertion into the genome of
a host organism or allow the removal from the genome. Methods such
as the cre/lox technique allow the expression cassette to be removed
tissue-specifically, possibly inducibly, from the genome of the host
organism (Sauer B. Methods. 1998; 14(4):381-92). Here, particular
flanking sequences are attached to the target gene (lox sequences),
which make subsequent removal by means of the cre recombinase
possible.



Preferably, the expression cassette, consisting of a linkage of the
promoter to the nucleic acid sequence to be transcribed, may have
been integrated into a vector and may be transferred into the plant
cell or organism, for example, by transformation, according to any
of the processes described below.



"Transgenic" means preferably, for example with respect to a
transgenic expression cassette, a transgenic expression vector, a
transgenic organism or to processes for transgenic expression of
nucleic acids, all constructions brought about by genetic
engineering methods or processes using said constructions, in which
either

a) the nucleic acid sequence to be expressed, or

b) the promoter functionally linked to the nucleic acid sequence to
be expressed according to a), or

c) (a) and (b)

are not located in their natural, genetic environment (i.e. at their
natural chromosomal locus) or have been modified by genetic
engineering methods, the modification possibly being, for example, a
substitution, addition, deletion, inversion or insertion of one or
more nucleotide residues. Natural genetic environment means the
natural chromosomal locus in the source organism or the presence in
a genomic library.

"Transgenic" means, with respect to expression ("transgenic
expression"), preferably all expressions achieved using a transgenic
expression cassette, transgenic expression vector or transgenic
organism, according to the definitions indicated above.



The DNA constructs employed within the scope of the process of 
the invention and the vectors derived therefrom may contain fur- 
ther functional elements. The term functional element is to be 
understood broadly and means all of those elements which influ- 
ence the preparation, propagation or function of the DNA 
constructs or of vectors or organisms derived therefrom. Examples 
which may be mentioned without being limited thereto are: 



1. Selection markers 



Selection markers comprise, for example, those nucleic acid or
protein sequences whose expression gives to a cell, tissue or organism
an advantage (positive selection marker) or disadvantage
(negative selection marker) over cells which do not express said
nucleic acid or protein. Positive selection markers act, for example,
by detoxifying a substance acting on the cell in an inhibitory
manner (e.g. resistance to antibiotics/herbicides) or by
forming a substance which enables the plant to regenerate better
or grow more under the chosen conditions (for example nutritive
markers, hormone-producing markers such as ipt; see below).
Another type of positive selection marker comprises mutated proteins
or RNAs which are not sensitive to a selective agent (e.g.
16S rRNA mutants which are insensitive to spectinomycin). Negative
selection markers act, for example, by catalyzing the formation
of a toxic substance in the transformed cells (e.g. the codA
gene).
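The distinction between the two marker types can be expressed as a toy decision rule. The following Python sketch is only an illustration under simplifying assumptions (all-or-nothing survival, one inhibitor, one nontoxic precursor); the class Cell and the function survives are hypothetical names that do not appear in this description.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    positive_marker: bool = False  # e.g. a resistance gene
    negative_marker: bool = False  # e.g. an enzyme forming a toxic product


def survives(cell: Cell, inhibitor: bool, precursor: bool) -> bool:
    """Toy rule: an inhibitor kills cells lacking the positive marker;
    a nontoxic precursor kills cells expressing the negative marker."""
    if inhibitor and not cell.positive_marker:
        return False
    if precursor and cell.negative_marker:
        return False
    return True


wild_type = Cell()
resistant = Cell(positive_marker=True)
sensitized = Cell(negative_marker=True)
```

Under this rule, a cell in which the negative marker has been switched off behaves like wild_type in the presence of the precursor, which is the situation the selection process of the invention exploits.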



1.1 Positive selection markers: 



In order to further increase the efficiency, the DNA constructs 
may comprise additional positive selection markers. In a pre- 
ferred embodiment, the process of the invention may thus be car- 
ried out in the form of a dual selection in which a sequence cod- 
ing for a resistance to at least one toxin, antibiotic or 
herbicide is introduced together with the nucleic acid sequence 
to be inserted and selection is carried out additionally by using 
the toxin, antibiotic or herbicide.

Appropriate proteins and sequences of positive selection markers
and also selection processes are familiar to the skilled worker.
The selection marker imparts to the successfully transformed
cells a resistance to a biocide (e.g. a herbicide such as phosphinothricin,
glyphosate or bromoxynil), a metabolism inhibitor
such as 2-deoxyglucose 6-phosphate (WO 98/45456) or an antibiotic
such as, for example, tetracycline, ampicillin, kanamycin, G 418,
neomycin, bleomycin or hygromycin. Selection markers which may be
mentioned by way of example are:


- phosphinothricin acetyltransferases (PAT) which acetylate the
free amino group of the glutamine synthetase inhibitor phosphinothricin
(PPT) and thus detoxify PPT (de Block et al. (1987)
EMBO J 6:2513-2518) (also referred to as Bialaphos® resistance
gene (bar)). Corresponding sequences are known to the
skilled worker (from Streptomyces hygroscopicus GenBank Acc.
No.: X17220 and X05822, from Streptomyces viridochromogenes
GenBank Acc. No.: M22827 and X65195; US 5,489,520). Furthermore,
synthetic genes have been described for expression in
plastids. A synthetic PAT gene is described in Becker et al.
(1994) Plant J 5:299-307. The genes impart a resistance to
the herbicide Bialaphos or glufosinate and are frequently
used markers in transgenic plants (Vickers JE et al. (1996)
Plant Mol Biol Reporter 14:363-368; Thompson CJ et al. (1987)
EMBO J 6:2519-2523).






- 5-enolpyruvylshikimate 3-phosphate synthases (EPSPS) which
impart a resistance to glyphosate (N-(phosphonomethyl)glycine).
The molecular target of the nonselective herbicide
glyphosate is 5-enolpyruvyl-3-phosphoshikimate synthase
(EPSPS). This enzyme has a key function in the biosynthesis
of aromatic amino acids in microbes and plants but not in
mammals (Steinrucken HC et al. (1980) Biochem Biophys Res
Commun 94:1207-1212; Levin JG and Sprinson DB (1964) J Biol
Chem 239:1142-1150; Cole DJ (1985) Mode of action of glyphosate:
a literature analysis, p. 48-74. In: Grossbard E and Atkinson
D (eds.). The herbicide glyphosate. Butterworths,
Boston). Preference is given to using glyphosate-tolerant
EPSPS variants as selection markers (Padgette SR et al.
(1996) New weed control opportunities: development of soybeans
with a Roundup Ready(TM) gene. In: Herbicide Resistant
Crops (Duke SO, ed.), pp. 53-84. CRC Press, Boca Raton,
FL; Saroha MK and Malik VS (1998) J Plant Biochemistry and
Biotechnology 7:65-72). The EPSPS gene of Agrobacterium sp.
strain CP4 has a natural tolerance for glyphosate, which can
be transferred to appropriate transgenic plants. The CP4
EPSPS gene was cloned from Agrobacterium sp. strain CP4 (Padgette
SR et al. (1995) Crop Science 35(5):1451-1461). Sequences
of EPSPS enzymes which are glyphosate-tolerant have
been described (inter alia in US 5,510,471; US 5,776,760;
US 5,864,425; US 5,633,435; US 5,627,061; US 5,463,175;
EP 0 218 571). Further sequences are described under GenBank
Acc. No: X63374 or M10947.

- Glyphosate®-degrading enzymes (gox gene; glyphosate oxidoreductase).
GOX (for example Achromobacter sp. glyphosate oxidoreductase)
catalyzes the cleavage of a C-N bond in glyphosate
which is thus converted to aminomethylphosphonic acid
(AMPA) and glyoxylate. GOX can thereby impart a resistance to
glyphosate (Padgette SR et al. (1996) J Nutr 126(3):702-16;
Shah D et al. (1986) Science 233:478-481).

- The deh gene encodes a dehalogenase which inactivates
Dalapon® (GenBank Acc. No.: AX022822, AX022820 and
WO 99/27116).

- The bxn genes encode bromoxynil-degrading nitrilase enzymes
(GenBank Acc. No: E01313 and J03196).

- Neomycin phosphotransferases impart a resistance to antibiotics
(aminoglycosides) such as neomycin, G418, hygromycin, paromomycin
or kanamycin by reducing the inhibiting action of
said antibiotics by means of a phosphorylation reaction. Particular
preference is given to the nptII gene. Sequences can
be obtained from GenBank (AF080390; AF080389). Moreover, the
gene is already part of numerous expression vectors and can
be isolated therefrom using processes familiar to the skilled
worker (AF234316; AF234315; AF234314). The NPTII gene encodes
an aminoglycoside 3'-O-phosphotransferase from E. coli Tn5
(GenBank Acc. No: U00004 position 1401-2300; Beck et al.
(1982) Gene 19:327-336).

- The DOGR1 gene was isolated from the yeast Saccharomyces
cerevisiae (EP-A 0 807 836) and encodes a 2-deoxyglucose 6-phosphate
phosphatase which imparts a resistance to 2-DOG
(Randez-Gil et al. (1995) Yeast 11:1233-1240; Sanz et al.
(1994) Yeast 10:1195-1202; GenBank Acc. No.: NC001140; position
194799-194056).

- Acetolactate synthases which impart a resistance to
imidazolinone/sulfonylurea herbicides (GenBank Acc. No.: X51514
(Sathasivan K et al. (1990) Nucleic Acids Res. 18(8):2188);
AB049823; AF094326; X07645; X07644; A19547; A19546; A19545;
I05376; I05373; AL133315).

- Hygromycin phosphotransferases (e.g. GenBank Acc. No.:
X74325) which impart a resistance to the antibiotic hygromycin.
The gene is part of numerous expression vectors and may
be isolated therefrom using processes familiar to the skilled
worker (such as, for example, polymerase chain reaction)
(GenBank Acc. No.: AF294981; AF234301; AF234300; AF234299;
AF234298; AF354046; AF354045).


- Genes of resistance to 



a) chloramphenicol (chloramphenicol acetyltransferase),

b) tetracycline (inter alia GenBank Acc. No.: X65876;
X51366). Moreover, the gene is already part of numerous
expression vectors and may be isolated therefrom using
processes familiar to the skilled worker (such as, for
example, polymerase chain reaction),

c) streptomycin (inter alia GenBank Acc. No.: AJ278607),

d) Zeocin; the corresponding resistance gene is part of numerous
cloning vectors (e.g. GenBank Acc. No.: L36849)
and may be isolated therefrom using processes familiar to
the skilled worker (such as, for example, polymerase
chain reaction),

e) ampicillin (β-lactamase gene; Datta N, Richmond MH
(1966) Biochem J 98(1):204-9; Heffron F et al. (1975)
J Bacteriol 122:250-256; Bolivar F et al. (1977)
Gene 2:95-114). The sequence is part of numerous cloning
vectors and may be isolated therefrom using processes familiar
to the skilled worker (such as, for example, polymerase
chain reaction).



Genes such as isopentenyl transferase from Agrobacterium tumefaciens
(strain: P022) (GenBank Acc. No.: AB025109) may also be used
as selection markers. The ipt gene encodes a key enzyme of cytokinin
biosynthesis. Its overexpression facilitates the regeneration of
plants (e.g. selection on cytokinin-free medium). The process for
utilizing the ipt gene has been described (Ebinuma H et al.
(1997) Proc Natl Acad Sci USA 94:2117-2121; Ebinuma H et al.
(2000) Selection of marker-free transgenic plants using the oncogenes
(ipt, rol A, B, C) of Agrobacterium as selectable markers,
In: Molecular Biology of Woody Plants. Kluwer Academic Publishers).

Various other positive selection markers which impart to the
transformed plants a growth advantage over untransformed plants
and also processes for their use are described, inter alia, in
EP-A 0 601 092. Examples which may be mentioned are β-glucuronidase
(in connection with cytokinin glucuronide, for example),
mannose 6-phosphate isomerase (in connection with mannose), UDP-galactose
4-epimerase (in connection with galactose, for example).



For a selection marker functional in plastids, particular preference
is given to those which impart a resistance to spectinomycin,
streptomycin, kanamycin, lincomycin, gentamycin, hygromycin,
methotrexate, bleomycin, phleomycin, blasticidin, sulfonamide,
phosphinothricin, chlorsulfuron, bromoxynil, glyphosate, 2,4-D,
atrazine, 4-methyltryptophan, nitrate, S-aminoethyl-L-cysteine,
lysine/threonine, aminoethyl-cysteine or betaine aldehyde. Particular
preference is given to the genes aadA, nptII, BADH, FLARE-S
(a fusion of aadA and GFP, described in Khan MS & Maliga P (1999)
Nature Biotech 17:910-915). Especially suitable is the aadA gene
(Svab Z and Maliga P (1993) Proc Natl Acad Sci USA 90:913-917).
Modified 16S rDNA and also betaine aldehyde dehydrogenase (BADH)
from spinach have also been described (Daniell H et al. (2001)
Trends Plant Science 6:237-239; Daniell H et al. (2001) Curr Genet
39:109-116; WO 01/64023; WO 01/64024; WO 01/64850). Lethal
agents such as, for example, glyphosate may also be utilized in
connection with correspondingly detoxifying or resistance enzymes
(WO 01/81605).

The concentrations of the antibiotics, herbicides, biocides or
toxins, which are used in each case for selection, must be adapted
to the particular test conditions or organisms. Examples which
may be mentioned for plants are kanamycin (Km) 50 mg/L, hygromycin
B 40 mg/L, phosphinothricin (Ppt) 6 mg/L, spectinomycin
(Spec) 500 mg/L.
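The arithmetic behind preparing such selection media can be sketched in a few lines of Python. The working concentrations are the example values given above; the stock concentrations, the medium volumes and the names WORKING_MG_PER_L and stock_volume_ml are hypothetical and serve only to illustrate the calculation.

```python
# Example working concentrations in mg/L, as listed in the text.
WORKING_MG_PER_L = {
    "kanamycin": 50.0,
    "hygromycin B": 40.0,
    "phosphinothricin": 6.0,
    "spectinomycin": 500.0,
}


def stock_volume_ml(agent: str, medium_volume_l: float,
                    stock_mg_per_ml: float) -> float:
    """Volume of stock solution (mL) needed to bring the given volume
    of medium to the working concentration of the selective agent."""
    required_mg = WORKING_MG_PER_L[agent] * medium_volume_l
    return required_mg / stock_mg_per_ml


# Hypothetical case: 0.5 L of medium, kanamycin stock at 50 mg/mL.
kan_ml = stock_volume_ml("kanamycin", 0.5, 50.0)
```

Because the concentrations must be adapted to the particular test conditions or organisms, the table is best treated as a starting point to be overridden per experiment.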



2. Reporter genes



Reporter genes code for readily quantifiable proteins and thus
ensure, via intrinsic color or enzyme activity, an evaluation of
the transformation efficiency and of the location or time of expression.
In this context, very particular preference is given to
genes coding for reporter proteins (see also Schenborn E, Groskreutz
D (1999) Mol Biotechnol 13(1):29-44) such as

- green fluorescence protein (GFP) (Chui WL et al. (1996) Curr
Biol 6:325-330; Leffel SM et al. (1997) Biotechniques
23(5):912-8; Sheen et al. (1995) Plant J 8(5):777-784;
Haseloff et al. (1997) Proc Natl Acad Sci USA 94(6):
2122-2127; Reichel et al. (1996) Proc Natl Acad Sci USA
93(12):5888-5893; Tian et al. (1997) Plant Cell Rep
16:267-271; WO 97/41228)

- chloramphenicol transferase

- luciferase (Millar et al. (1992) Plant Mol Biol Rep 10:
324-414; Ow et al. (1986) Science 234:856-859); allows bioluminescence
detection

- β-galactosidase (encodes an enzyme for which various chromogenic
substrates are available)

- β-glucuronidase (GUS) (Jefferson et al. (1987) EMBO J 6:
3901-3907) or the uidA gene (encode enzymes for which various
chromogenic substrates are available)

- R-locus gene product which regulates production of anthocyanin
pigments (red color) in plant tissue and thus makes possible
a direct analysis of the promoter activity without
addition of additional auxiliary substances or chromogenic
substrates (Dellaporta et al. (1988) In: Chromosome Structure
and Function: Impact of New Concepts, 18th Stadler Genetics
Symposium, 11:263-282)

- tyrosinase (Katz et al. (1983) J Gen Microbiol 129:2703-
2714), an enzyme which oxidizes tyrosine to give DOPA and dopaquinone
which consequently form the readily detectable melanin

- aequorin (Prasher et al. (1985) Biochem Biophys Res Commun
126(3):1259-1268), may be used in calcium-sensitive bioluminescence
detection.

3. Origins of replication which ensure propagation of the expression
cassettes or vectors of the invention, for example
in E. coli. Examples which may be mentioned are ORI (origin
of DNA replication), the pBR322 ori or the P15A ori (Sambrook
et al.: Molecular Cloning. A Laboratory Manual, 2nd ed. Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, NY,
1989).



PF 53790 



CA 02493364 2005-01-21 




4. Elements, for example border sequences, which enable agrobac- 
teria-mediated transfer into plant cells for transfer and in- 
tegration into the plant genome, such as, for example, the 
right or left border of T-DNA or the vir region. 


5. Multiple cloning regions (MCS) allow and facilitate the in- 
sertion of one or more nucleic acid sequences. 

Nucleic acid sequences (e.g. expression cassettes) may be
introduced into a plant organism or cells, tissues, organs, parts
or seeds thereof by advantageously using vectors which contain
said sequences. Vectors may be, by way of example, plasmids, cosmids,
phages, viruses or else agrobacteria. The sequences may be
inserted into the vector (preferably a plasmid vector) via suitable
restriction cleavage sites. The resulting vector may first
be introduced into E. coli and amplified. Correctly transformed
E. coli are selected and grown, and the recombinant vector is obtained
using methods familiar to the skilled worker. Restriction
analysis and sequencing may serve to check the cloning step.
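A restriction analysis of this kind compares observed fragment sizes with those predicted from the expected vector sequence. The following Python sketch predicts fragment lengths for a circular plasmid cut at every occurrence of a recognition site. It is an illustration under simplifying assumptions (a single enzyme, top strand only, no methylation effects); the function name digest_circular and the toy plasmid are hypothetical, while the EcoRI site GAATTC with the cut after the first G is standard.

```python
from typing import List


def digest_circular(plasmid: str, site: str, cut_offset: int) -> List[int]:
    """Fragment lengths obtained by cutting a circular sequence at every
    occurrence of site; cut_offset is the cut position inside the site."""
    n = len(plasmid)
    # Append the start of the sequence so that sites spanning the
    # origin of the circle are also found.
    doubled = plasmid + plasmid[: len(site) - 1]
    cuts = sorted(
        (i + cut_offset) % n
        for i in range(n)
        if doubled.startswith(site, i)
    )
    if not cuts:
        return [n]  # uncut circle, reported with its full length
    # Distance from each cut to the next one, wrapping around the circle;
    # a single cut yields one linear fragment of full length.
    return sorted(
        ((cuts[(k + 1) % len(cuts)] - cuts[k]) % n) or n
        for k in range(len(cuts))
    )


# Toy 30 bp "plasmid" with two EcoRI sites (GAATTC, cut after the G).
toy_plasmid = "GAATTC" + "A" * 10 + "GAATTC" + "C" * 8
fragments = digest_circular(toy_plasmid, "GAATTC", 1)
```

Comparing such a predicted list with the band pattern of a gel gives a quick plausibility check before sequencing.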

Preference is given to those vectors which make possible a stable 
integration into the host genome. 

The preparation of a transformed organism (or a transformed cell
or tissue) requires that the corresponding DNA (e.g. the transformation
vector) or RNA is introduced into the corresponding
host cell. For this process, which is referred to as transformation
(or transduction or transfection), a multiplicity of methods
and vectors are available (Keown et al. (1990) Methods in Enzymology
185:527-537; Plant Molecular Biology and Biotechnology
(CRC Press, Boca Raton, Florida), Chapter 6/7, pp. 71-119 (1993);
White FF (1993) Vectors for Gene Transfer in Higher Plants; in:
Transgenic Plants, Vol. 1, Engineering and Utilization, editors:
Kung SD and Wu R, Academic Press, 15-38; Jenes B et al. (1993)
Techniques for Gene Transfer, in: Transgenic Plants, Vol. 1,
Engineering and Utilization, editors: Kung SD and Wu R, Academic Press,
pp. 128-143; Potrykus (1991) Annu Rev Plant Physiol Plant Molec
Biol 42:205-225; Halford NG, Shewry PR (2000) Br Med Bull
56(1):62-73).


For example, the DNA or RNA may be introduced directly by microinjection
(WO 92/09696, WO 94/00583, EP-A 0 331 083, EP-A 0 175
966) or by bombardment with DNA- or RNA-coated microparticles
(biolistic processes using the gene gun, "particle bombardment";
US 5,100,792; EP-A 0 444 882; EP-A 0 434 616; Fromm ME et al.
(1990) Bio/Technology 8(9):833-9; Gordon-Kamm et al. (1990) Plant
Cell 2:603). The cell may also be permeabilized chemically, for
example with polyethylene glycol, so as to enable the DNA to
reach the cell by means of diffusion. The DNA may also be introduced
by means of protoplast fusion to other DNA-containing units such
as minicells, cells, lysosomes or liposomes (Freeman et al.
(1984) Plant Cell Physiol. 29:1353ff; US 4,536,475). Electroporation
is another suitable method for introducing DNA, in which the
cells are permeabilized reversibly by an electric impulse (EP-A
290 395, WO 87/06614). Further processes comprise the calcium
phosphate-mediated transformation, DEAE-dextran-mediated transformation,
the incubation of dry embryos in DNA-containing solution
or other methods of direct introduction of DNA (DE 4 005
152, WO 90/12096, US 4,684,611). Appropriate processes have been
described (e.g. in Bilang et al. (1991) Gene 100:247-250; Scheid
et al. (1991) Mol Gen Genet 228:104-112; Guerche et al. (1987)
Plant Science 52:111-116; Neuhause et al. (1987) Theor Appl Genet
75:30-36; Klein et al. (1987) Nature 327:70-73; Howell et al.
(1980) Science 208:1265; Horsch et al. (1985) Science 227:1229-
1231; DeBlock et al. (1989) Plant Physiology 91:694-701; Methods
for Plant Molecular Biology (Weissbach and Weissbach, eds.) Academic
Press Inc. (1988); and Methods in Plant Molecular Biology
(Schuler and Zielinski, eds.) Academic Press Inc. (1989)). Physical
methods of introducing DNA into plant cells have been reviewed
by Oard (1991) Biotech Adv 9:1-11.


In the case of these "direct" transformation methods, no particu- 
lar requirements are made on the plasmid used. It is possible to 
use simple plasmids such as those of the pUC series, pBR322, 
M13mp series, pACYC184 etc. 


Besides these "direct" transformation techniques, transformation
may also be carried out by bacterial infection by means of Agrobacterium
(e.g. EP 0 116 718), viral infection by means of viral
vectors (EP 0 067 553; US 4,407,956; WO 95/34668; WO 93/03161) or
by means of pollen (EP 0 270 356; WO 85/01856; US 4,684,611).



Transformation is preferably carried out by means of agrobacteria
which contain disarmed Ti-plasmid vectors, using the latters'
natural ability to transfer genes to plants (EP-A 0 270 355; EP-A
0 116 718). Agrobacterium transformation is widespread for transforming
dicotyledons, but is also increasingly applied to monocotyledons
(Toriyama et al. (1988) Bio/Technology 6:1072-1074;
Zhang et al. (1988) Plant Cell Rep 7:379-384; Zhang et al. (1988)
Theor Appl Genet 76:835-840; Shimamoto et al. (1989) Nature
338:274-276; Datta et al. (1990) Bio/Technology 8:736-740;
Christou et al. (1991) Bio/Technology 9:957-962; Peng et al.
(1991) International Rice Research Institute, Manila, Philippines




563-574; Cao et al. (1992) Plant Cell Rep 11:585-591; Li et al.
(1993) Plant Cell Rep 12:250-255; Rathore et al. (1993) Plant Mol
Biol 21:871-884; Fromm et al. (1990) Bio/Technology 8:833-839;
Gordon-Kamm et al. (1990) Plant Cell 2:603-618; D'Halluin et al.
(1992) Plant Cell 4:1495-1505; Walters et al. (1992) Plant Mol
Biol 18:189-200; Koziel et al. (1993) Biotechnology 11:194-200;
Vasil IK (1994) Plant Mol Biol 25:925-937; Weeks et al. (1993)
Plant Physiol 102:1077-1084; Somers et al. (1992) Bio/Technology
10:1589-1594; WO 92/14828; Hiei et al. (1994) Plant J 6:271-282).


The strains most often used for agrobacterial transformation,
Agrobacterium tumefaciens or Agrobacterium rhizogenes, contain a
plasmid (Ti and Ri plasmids, respectively), which is transferred
to the plant after agrobacterial infection. Part of this plasmid,
called T-DNA (transferred DNA), is integrated into the genome of
the plant cell. Alternatively, Agrobacterium may also transfer
binary vectors (mini Ti plasmids) to plants and integrate them
into the genome of said plants.


The application of Agrobacterium tumefaciens to the transformation
of plants, using tissue culture explants, has been described
(inter alia, Horsch RB et al. (1985) Science 227:1229ff; Fraley
et al. (1983) Proc Natl Acad Sci USA 80:4803-4807; Bevan et al.
(1983) Nature 304:184-187). Many Agrobacterium tumefaciens
strains are capable of transferring genetic material, such as,
for example, the strains EHA101[pEHA101], EHA105[pEHA105],
LBA4404[pAL4404], C58C1[pMP90] and C58C1[pGV2260] (Hood et al.
(1993) Transgenic Res 2:208-218; Hoekema et al. (1983) Nature
303:179-181; Koncz and Schell (1986) Mol Gen Genet 204:383-396;
Deblaere et al. (1985) Nucl Acids Res 13:4777-4788).



When using agrobacteria, the expression cassette must be integrated
into special plasmids, either a shuttle or intermediate
vector or a binary vector. When a Ti or Ri plasmid is used for
transformation, at least the right border, but usually the
right and left borders of the Ti or Ri plasmid T-DNA are connected
as a flanking region to the expression cassette to be
introduced. Preference is given to using binary vectors. Binary
vectors may replicate both in E. coli and in agrobacteria and
contain the components required for transfer into a plant system.
They normally contain a selection marker gene for selection of
transformed plants (e.g. the nptII gene which imparts a resistance
to kanamycin) and a linker or polylinker flanked by the
right and left T-DNA border sequences. Moreover, they contain,
outside the T-DNA border sequence, a selection marker which
enables transformed E. coli and/or agrobacteria to be selected
(e.g. the nptIII gene which imparts a resistance to kanamycin).
Corresponding vectors may be transformed directly into Agrobacterium
(Holsters et al. (1978) Mol Gen Genet 163:181-187).


Binary vectors are based, for example, on "broad host range"
plasmids such as pRK252 (Bevan et al. (1984) Nucl Acid Res
12:8711-8720) and pTJS75 (Watson et al. (1985) EMBO J 4(2):277-
284). A large group of the binary vectors used is derived from
pBIN19 (Bevan et al. (1984) Nucl Acid Res 12:8711-8720).


Hajdukiewicz et al. developed a binary vector (pPZP) which is
smaller and more efficient than the previously customary vectors
(Hajdukiewicz et al. (1994) Plant Mol Biol 25:989-994). Improved
and particularly preferred binary vector systems for
Agrobacterium-mediated transformation are described in WO 02/00900.


The agrobacteria transformed with a vector of this kind may then
be used in the known manner for transforming plants, in particular
crop plants such as, for example, oilseed rape, for example
by bathing wounded leaves or leaf sections in an agrobacterial
solution and subsequently culturing them in suitable media. The
transformation of plants by agrobacteria has been described
(White FF, Vectors for Gene Transfer in Higher Plants; in: Transgenic
Plants, Vol. 1, Engineering and Utilization, edited by S.D.
Kung and R. Wu, Academic Press, 1993, pp. 15-38; Jenes B et
al. (1993) Techniques for Gene Transfer, in: Transgenic Plants,
Vol. 1, Engineering and Utilization, edited by S.D. Kung and
R. Wu, Academic Press, pp. 128-143; Potrykus (1991) Annu Rev
Plant Physiol Plant Molec Biol 42:205-225). Transgenic plants may
be regenerated in the known manner from the transformed cells of
the wounded leaves or leaf sections.



Different explants, cell cultures, tissues, organs, embryos, seeds,
microspores or other unicellular or multicellular cellular structures
derived from a plant organism may be used for transformation.
Transformation processes adjusted to the particular explants,
cultures or tissues are known to the skilled worker.
Examples which may be mentioned are: shoot internodes (Fry J et
al. (1987) Plant Cell Rep. 6:321-325), hypocotyls (Radke SE et
al. (1988) Theor Appl Genet 75:685-694; Schroder M et al. (1994)
Physiologia Plant 92:37-46; Stefanov I et al. (1994) Plant Sci.
95:175-186; Weier et al. (1997) Fett/Lipid 99:160-165), cotyledonous
petioles (Moloney MM et al. (1989) Plant Cell Rep 8:238-242;
Weier D et al. (1998) Molecular Breeding 4:39-46), microspores
and proembryos (Pechan (1989) Plant Cell Rep. 8:387-390) and
flower stalks (Boulter ME et al. (1990) Plant Sci 70:91-99;
Guerche P et al. (1987) Mol Gen Genet 206:382-386). In the case
of a direct gene transfer, mesophyll protoplasts (Chapel PJ &
Glimelius K (1990) Plant Cell Rep 9:105-108; Golz et al. (1990)
Plant Mol Biol 15:475-483) or else hypocotyl protoplasts
(Bergmann P & Glimelius K (1993) Physiologia Plant 88:604-611)
and microspores (Chen JL et al. (1994) Theor Appl Genet
88:187-192; Jones-Villeneuve E et al. (1995) Plant Cell Tissue and
Organ Cult 40:97-100) and shoot sections (Seki M et al. (1991)
Plant Mol Biol 17:259-263) may be employed successfully.



Stably transformed cells, i.e. those which contain the introduced
DNA integrated into the DNA of the host cell, may be selected
from untransformed cells by using the selection process of the
invention. The plants obtained may be grown and crossed in the
usual way. Preferably, two or more generations should be cultured
in order to ensure that the genomic integration is stable and can
be inherited.



As soon as a transformed plant cell has been prepared, it is possible
to obtain a complete plant by using processes known to the
skilled worker. This involves, for example, starting from callus
cultures, individual cells (e.g. protoplasts) or leaf disks
(Vasil et al. (1984) Cell Culture and Somatic Cell Genetics of
Plants, Vol I, II and III, Laboratory Procedures and Their Applications,
Academic Press; Weissbach and Weissbach (1989) Methods
for Plant Molecular Biology, Academic Press). It is possible
to induce from these still undifferentiated callus cell masses
the formation of shoot and root in the known manner. The seedlings
obtained may be planted out and grown. Appropriate processes
have been described (Fennell et al. (1992) Plant Cell Rep.
11:567-570; Stoeger et al. (1995) Plant Cell Rep. 14:273-278;
Jahne et al. (1994) Theor Appl Genet 89:525-533).



The efficacy of expressing the transgenically expressed nucleic
acids may be determined, for example, in vitro by shoot-meristem
propagation using any of the selection methods described above.
Moreover, changes in the type and level of expression of a target
gene and the effect on the phenotype of the plant may be tested
in greenhouse experiments using test plants.


The process of the invention is preferably used within the framework
of plant biotechnology for generating plants having advantageous
properties. The "nucleic acid sequence to be inserted" into
the genome of the plant cell or the plant organism preferably
comprises at least one expression cassette, said expression cassette
being able to express, under the control of a promoter
functional in plant cells or plant organisms, an RNA and/or a
protein which do not cause reduction of the expression, amount,
activity and/or function of a marker protein but, particularly
preferably, impart to the plant genetically altered in this way
an advantageous phenotype. Numerous genes and proteins which may
be used for achieving an advantageous phenotype, for example for
the increase in quality of foodstuff or for producing particular
chemicals or pharmaceuticals (Dunwell JM (2000) J Exp Bot 51 Spec
No: 487-96) are known to the skilled worker.

Thus it is possible to improve the suitability of the plants or
the seeds thereof as foodstuff or feedstuff, for example by altering
the compositions and/or the content of metabolites, in
particular proteins, oils, vitamins and/or starch. It is also
possible to increase the growth rate, yield or resistance to 
biotic or abiotic stress factors. Advantageous effects may be 
achieved both by transgenic expression of nucleic acids or pro- 
teins and by targeted reduction of the expression of endogenous 
genes, with respect to the phenotype of the transgenic plant. The 
advantageous effects which may be achieved in the transgenic 
plant comprise, for example: 

- increased resistance to pathogens (biotic stress) 

- increased resistance to environmental factors such as heat, 
cold, frost, drought, UV light, oxidative stress, wetness, 
salt, etc. (abiotic stress) 

- increased yield 

- improved quality, for example increased nutritional value,
increased storability

The invention further relates to the use of the transgenic plants 
prepared according to the process of the invention and of the 
cells, cell cultures, plants or propagation material such as 
seeds or fruits derived from said plants, for preparing foodstuff 
or feedstuff, pharmaceuticals or fine chemicals such as, for ex- 
ample, enzymes, vitamins, amino acids, sugars, fatty acids, natu- 
ral and synthetic flavorings, aroma substances and colorants. 
Particular preference is given to the production of triacylglycerides,
lipids, oils, fatty acids, starch, tocopherols and
tocotrienols and also carotenoids. Genetically modified plants of
the invention which may be consumed by humans and animals may
also be used as foodstuff or feedstuff, for example, directly or
after preparation known per se.

As already mentioned above, the process of the invention comprises
in a particularly advantageous embodiment, in a process
step downstream of the selection, the deletion of the sequence
coding for the marker protein (e.g. mediated by recombinase or as
described in WO03/004659) or the elimination by crossing and/or
segregation of said sequences. (It is obvious to the skilled
worker that, for this purpose, the nucleic acid sequence integrated
into the genome and the sequence coding for the marker
protein should have a separate chromosomal locus in the transformed
cells. This, however, is the case in the majority of the
resulting plants, merely for reasons of statistics.) This procedure
is particularly advantageous if the marker protein is a
transgene which otherwise does not occur in the plant to be
transformed. Although the resulting plant may still possibly contain
the compound for reducing the expression, amount, activity
and/or function of the marker protein, said compound would no
longer have any "counterpart" in the form of said marker protein,
and thus would have no effect. This is particularly the case if
the marker protein is derived from a non-plant organism and/or is
synthetic (for example the codA protein). It is, however, also
possible to use plant marker proteins from other plant species,
which otherwise do not occur in the cell to be transformed (i.e.
if not introduced as transgene). Said marker proteins are referred
to as "nonendogenous" marker proteins within the scope of
the present invention.
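The elimination of the marker by segregation can be made concrete with a short calculation. The following Python sketch enumerates the progeny of a self-pollinated plant under assumptions that are not stated in this description: one hemizygous copy of the inserted sequence, one hemizygous copy of the marker gene, both at unlinked loci, and Mendelian segregation. The function name fraction_insert_without_marker is hypothetical.

```python
from fractions import Fraction
from itertools import product


def fraction_insert_without_marker() -> Fraction:
    """Fraction of selfed progeny that carry the inserted sequence but
    no longer carry the marker gene (two unlinked hemizygous loci)."""
    # Each gamete either carries or lacks each locus: four equally
    # likely combinations of (has_insert, has_marker).
    gametes = list(product([True, False], repeat=2))
    wanted = 0
    total = 0
    for (t1, m1), (t2, m2) in product(gametes, repeat=2):
        total += 1
        has_insert = t1 or t2
        has_marker = m1 or m2
        if has_insert and not has_marker:
            wanted += 1
    return Fraction(wanted, total)


marker_free_fraction = fraction_insert_without_marker()
```

Under these assumptions 3 of 16 progeny retain the inserted sequence while lacking the marker; linkage between the two loci would reduce this fraction.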

Very particularly advantageously, the compound for reducing the
expression, amount, activity and/or function of the marker protein
is an RNA. After deletion or elimination by crossing/segregation,
the resulting transgenic plant would no longer have any
unnecessary (and, if appropriate, undesired) foreign protein. The
sole foreign protein would possibly be the protein resulting from
the nucleic acid sequence inserted into the genome. For reasons
of product approval, this embodiment is particularly advantageous.
As described above, said RNA may be an antisense RNA or,
particularly preferably, a double-stranded RNA. It may be expressed
separately from the RNA coding for the target protein but
also, possibly, on the same strand as the latter.

In summary, the particularly advantageous embodiment comprises
the following features:

A process for preparing transformed plant cells or organisms,
which comprises the following steps:

a) transforming a population of plant cells which comprises at
   least one non-endogenous (preferably non-plant) marker protein
   capable of converting directly or indirectly a substance X
   which is nontoxic for said population of plant cells into a
   substance Y which is toxic for said population, with at least
   one nucleic acid sequence to be inserted in combination with
   at least one nucleic acid sequence coding for a ribonucleic
   acid sequence capable of reducing the expression, amount,
   activity and/or function of said marker protein, and

b) treating said population of plant cells with the substance X
   at a concentration which causes a toxic effect for
   nontransformed cells, due to the conversion by the marker
   protein, and

c) selecting transformed plant cells (and/or populations of
   plant cells, such as plant tissues or plants) whose genome
   contains said nucleic acid sequence and which have a growth
   advantage over nontransformed cells, due to the action of
   said compound, from said population of plant cells, the
   selection being carried out under conditions under which the
   marker protein can exert its toxic effect on the
   nontransformed cells, and

d) regenerating fertile plants, and

e) eliminating by crossing the nucleic acid sequence coding for
   the marker protein and isolating fertile plants whose genome
   contains said nucleic acid sequence but no longer contains
   the sequence coding for the marker protein.
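
The logic of steps a) to e) can be sketched as a small truth-table
model. This is an illustration only and not part of the claimed
process: the class and function names are hypothetical, and
silencing is assumed to be complete, which a real RNA-mediated
reduction need not be.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    has_marker: bool    # sequence coding for the marker protein is present
    has_silencer: bool  # RNA reducing marker expression is present


def marker_active(cell: Cell) -> bool:
    # Assumption: the silencing RNA switches the marker off completely.
    return cell.has_marker and not cell.has_silencer


def survives(cell: Cell, substance_x_present: bool) -> bool:
    # Toxic substance Y only forms where an active marker meets substance X.
    return not (substance_x_present and marker_active(cell))


population = {
    "nontransformed": Cell(has_marker=True, has_silencer=False),
    "transformed": Cell(has_marker=True, has_silencer=True),
    "marker crossed out": Cell(has_marker=False, has_silencer=True),
}

# Step b)/c): apply substance X and keep the cells that still grow.
survivors = [name for name, cell in population.items() if survives(cell, True)]
print(survivors)
```

In this simplified model only the nontransformed cells are removed
by substance X, and a plant that has lost the marker gene in
step e) remains insensitive to X whether or not it still carries
the silencing RNA.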

Sequences

SEQ ID NO: 1   Nucleic acid sequence coding for E. coli cytosine
               deaminase (codA)

SEQ ID NO: 2   Amino acid sequence coding for E. coli cytosine
               deaminase (codA)

SEQ ID NO: 3   Nucleic acid sequence coding for E. coli cytosine
               deaminase (codA), with modified start codon
               (GTG/ATG) for expression in eukaryotes

SEQ ID NO: 4   Amino acid sequence coding for E. coli cytosine
               deaminase (codA), with modified start codon
               (GTG/ATG) for expression in eukaryotes

SEQ ID NO: 5   Nucleic acid sequence coding for Streptomyces
               griseolus cytochrome P450-SU1 (suaC)

SEQ ID NO: 6   Amino acid sequence coding for Streptomyces
               griseolus cytochrome P450-SU1 (suaC)

SEQ ID NO: 7   Nucleic acid sequence coding for Agrobacterium
               tumefaciens indoleacetamide hydrolase (tms2)

SEQ ID NO: 8   Amino acid sequence coding for Agrobacterium
               tumefaciens indoleacetamide hydrolase (tms2)

SEQ ID NO: 9   Nucleic acid sequence coding for Agrobacterium
               tumefaciens indoleacetamide hydrolase (tms2)

SEQ ID NO: 10  Amino acid sequence coding for Agrobacterium
               tumefaciens indoleacetamide hydrolase (tms2)

SEQ ID NO: 11  Nucleic acid sequence coding for Xanthobacter
               autotrophicus haloalkane dehalogenase (dhlA)

SEQ ID NO: 12  Amino acid sequence coding for Xanthobacter
               autotrophicus haloalkane dehalogenase (dhlA)

SEQ ID NO: 13  Nucleic acid sequence coding for Herpes simplex
               virus 1 thymidine kinase

SEQ ID NO: 14  Amino acid sequence coding for Herpes simplex
               virus 1 thymidine kinase

SEQ ID NO: 15  Nucleic acid sequence coding for Herpes simplex
               virus 1 thymidine kinase

SEQ ID NO: 16  Amino acid sequence coding for Herpes simplex
               virus 1 thymidine kinase

SEQ ID NO: 17  Nucleic acid sequence coding for Toxoplasma gondii
               hypoxanthine-xanthine-guanine phosphoribosyl
               transferase

SEQ ID NO: 18  Amino acid sequence coding for Toxoplasma gondii
               hypoxanthine-xanthine-guanine phosphoribosyl
               transferase

SEQ ID NO: 19  Nucleic acid sequence coding for E. coli
               xanthine-guanine phosphoribosyl transferase

SEQ ID NO: 20  Amino acid sequence coding for E. coli
               xanthine-guanine phosphoribosyl transferase

SEQ ID NO: 21  Nucleic acid sequence coding for E. coli
               xanthine-guanine phosphoribosyl transferase

SEQ ID NO: 22  Amino acid sequence coding for E. coli
               xanthine-guanine phosphoribosyl transferase

SEQ ID NO: 23  Nucleic acid sequence coding for E. coli purine
               nucleoside phosphorylase (deoD)

SEQ ID NO: 24  Nucleic acid sequence coding for E. coli purine
               nucleoside phosphorylase (deoD)

SEQ ID NO: 25  Nucleic acid sequence coding for Burkholderia
               caryophylli phosphonate monoester hydrolase (pehA)

SEQ ID NO: 26  Amino acid sequence coding for Burkholderia
               caryophylli phosphonate monoester hydrolase (pehA)

SEQ ID NO: 27  Nucleic acid sequence coding for Agrobacterium
               rhizogenes tryptophan oxygenase (aux1)

SEQ ID NO: 28  Amino acid sequence coding for Agrobacterium
               rhizogenes tryptophan oxygenase (aux1)

SEQ ID NO: 29  Nucleic acid sequence coding for Agrobacterium
               rhizogenes indoleacetamide hydrolase (aux2)

SEQ ID NO: 30  Amino acid sequence coding for Agrobacterium
               rhizogenes indoleacetamide hydrolase (aux2)

SEQ ID NO: 31  Nucleic acid sequence coding for Agrobacterium
               tumefaciens tryptophan oxygenase (aux1)

SEQ ID NO: 32  Amino acid sequence coding for Agrobacterium
               tumefaciens tryptophan oxygenase (aux1)

SEQ ID NO: 33  Nucleic acid sequence coding for Agrobacterium
               tumefaciens indoleacetamide hydrolase (aux2)

SEQ ID NO: 34  Amino acid sequence coding for Agrobacterium
               tumefaciens indoleacetamide hydrolase (aux2)

SEQ ID NO: 35  Nucleic acid sequence coding for Agrobacterium
               vitis indoleacetamide hydrolase (aux2)

SEQ ID NO: 36  Amino acid sequence coding for Agrobacterium vitis
               indoleacetamide hydrolase (aux2)

SEQ ID NO: 37  Nucleic acid sequence coding for Arabidopsis
               thaliana 5-methylthioribose kinase (mtrK)

SEQ ID NO: 38  Amino acid sequence coding for Arabidopsis
               thaliana 5-methylthioribose kinase (mtrK)

SEQ ID NO: 39  Nucleic acid sequence coding for Klebsiella
               pneumoniae 5-methylthioribose kinase (mtrK)

SEQ ID NO: 40  Amino acid sequence coding for Klebsiella
               pneumoniae 5-methylthioribose kinase (mtrK)

SEQ ID NO: 41  Nucleic acid sequence coding for Arabidopsis
               thaliana alcohol dehydrogenase (adh)

SEQ ID NO: 42  Amino acid sequence coding for Arabidopsis
               thaliana alcohol dehydrogenase (adh)

SEQ ID NO: 43  Nucleic acid sequence coding for Hordeum vulgare
               (barley) alcohol dehydrogenase (adh)

SEQ ID NO: 44  Amino acid sequence coding for Hordeum vulgare
               (barley) alcohol dehydrogenase (adh)

SEQ ID NO: 45  Nucleic acid sequence coding for Oryza sativa
               (rice) alcohol dehydrogenase (adh)

SEQ ID NO: 46  Amino acid sequence coding for Oryza sativa (rice)
               alcohol dehydrogenase (adh)

SEQ ID NO: 47  Nucleic acid sequence coding for Zea mays (corn)
               alcohol dehydrogenase (adh)

SEQ ID NO: 48  Amino acid sequence coding for Zea mays (corn)
               alcohol dehydrogenase (adh)

SEQ ID NO: 49  Nucleic acid sequence coding for a sense RNA
               fragment of E. coli cytosine deaminase
               (codARNAi-sense)

SEQ ID NO: 50  Oligonucleotide primer codA5'HindIII
               5'-AAGCTTGGCTAACAGTGTCGAATAACG-3'

SEQ ID NO: 51  Oligonucleotide primer codA3'SalI
               5'-GTCGACGACAAAATCCCTTCCTGAGG-3'

SEQ ID NO: 52  Nucleic acid sequence coding for an antisense RNA
               fragment of E. coli cytosine deaminase
               (codARNAi-anti)

SEQ ID NO: 53  Oligonucleotide primer codA5'EcoRI
               5'-GAATTCGGCTAACAGTGTCGAATAACG-3'

SEQ ID NO: 54  Oligonucleotide primer codA3'BamHI
               5'-GGATCCGACAAAATCCCTTCCTGAGG-3'

SEQ ID NO: 55  Vector construct pBluKS-nitP-STLS1-35S-T

SEQ ID NO: 56  Expression vector pSUN-1

SEQ ID NO: 57  Transgenic expression vector pSUN-1-codA-RNAi

SEQ ID NO: 58  Transgenic expression vector
               pSUN1-codA-RNAi-At.Act.-2-At.Als-R-ocsT

SEQ ID NO: 59  Nucleic acid sequence coding for 5-methylthioribose
               kinase (mtrK) from corn (Zea mays); fragment

SEQ ID NO: 60  Amino acid sequence coding for 5-methylthioribose
               kinase (mtrK) from corn (Zea mays); fragment

SEQ ID NO: 61  Nucleic acid sequence coding for 5-methylthioribose
               kinase (mtrK) from oilseed rape (Brassica napus),
               fragment

SEQ ID NO: 62  Amino acid sequence coding for 5-methylthioribose
               kinase (mtrK) from oilseed rape (Brassica napus),
               fragment

SEQ ID NO: 63  Nucleic acid sequence coding for 5-methylthioribose
               kinase (mtrK) from oilseed rape (Brassica napus),
               fragment

SEQ ID NO: 64  Amino acid sequence coding for 5-methylthioribose
               kinase (mtrK) from oilseed rape (Brassica napus),
               fragment

SEQ ID NO: 65  Nucleic acid sequence coding for 5-methylthioribose
               kinase (mtrK) from rice (Oryza sativa), fragment

SEQ ID NO: 66  Amino acid sequence coding for 5-methylthioribose
               kinase (mtrK) from rice (Oryza sativa), fragment

SEQ ID NO: 67  Nucleic acid sequence coding for 5-methylthioribose
               kinase (mtrK) from soybean (Glycine max), fragment

SEQ ID NO: 68  Amino acid sequence coding for 5-methylthioribose
               kinase (mtrK) from soybean (Glycine max), fragment

SEQ ID NO: 69  Oligonucleotide primer codA5'C-term
               5'-CGTGAATACGGCGTGGAGTCG-3'

SEQ ID NO: 70  Oligonucleotide primer codA3'C-term
               5'-CGGCAGGATAATCAGGTTGG-3'

SEQ ID NO: 71  Oligonucleotide primer 35sT 5' primer
               5'-GTCAACGTAACCAACCCTGC-3'

Figures

Fig. 1: Inactivation of the marker protein gene by means of
        introducing a recombinase

        P:      Promoter
        MP:     Sequence coding for a marker protein
        R1/R2:  Recombinase recognition sequences
        R:      Recombinase or sequence coding for recombinase

        In a preferred embodiment, the marker protein gene is
        inactivated by introducing a sequence-specific recombinase.
        Preference is given to expressing the recombinase, as
        depicted here, starting from an expression cassette.

        The marker protein gene is flanked by recognition
        sequences for sequence-specific recombinases, with
        sequences of said marker protein gene being deleted by
        introducing said recombinase and thus said marker protein
        gene being inactivated.

Fig. 2-A: Inactivation of the marker protein gene by the action of
        a sequence-specific nuclease

        P:          Promoter
        DS:         Recognition sequence for targeted induction of
                    DNA double-strand breaks
        MP-DS-MP':  Sequence coding for a marker protein,
                    comprising a DS
        nDS:        Inactivated DS
        E:          Sequence-specific enzyme for targeted
                    induction of DNA double-strand breaks

        The marker protein gene may be inactivated by a targeted
        mutation or deletion in the marker protein gene, for
        example by sequence-specific induction of DNA double-strand
        breaks at a recognition sequence for targeted induction
        of DNA double-strand breaks in or close to the marker
        protein gene (P-MP). The double-strand break may occur in
        the coding region or else the noncoding (such as, for
        example, the promoter) region and induces an illegitimate
        recombination (nonhomologous DNA-end joining) and thus,
        for example, a shift in the reading frame of said marker
        protein.
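
The reading-frame shift mentioned for Fig. 2-A can be illustrated
with a toy example. The sequence and the size of the deletion are
hypothetical and chosen only to show the arithmetic: an insertion
or deletion introduced during end joining shifts the frame
whenever its net length is not a multiple of three.

```python
def shifts_reading_frame(inserted_bp: int, deleted_bp: int) -> bool:
    # A net length change that is not a multiple of 3 shifts the frame.
    return (inserted_bp - deleted_bp) % 3 != 0


def codons(seq: str) -> list[str]:
    # Split into complete triplets, dropping an incomplete trailing codon.
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


original = "ATGGCTGAAACCTGA"  # hypothetical toy reading frame, not codA
cut = 6                        # hypothetical position of the double-strand break
repaired = original[:cut] + original[cut + 1:]  # end joining loses 1 bp

print(codons(original))
print(codons(repaired))
print(shifts_reading_frame(inserted_bp=0, deleted_bp=1))
```

Every codon downstream of the break changes in the repaired
sequence, whereas a 3 bp deletion would remove one codon and leave
the rest of the frame intact.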



Fig. 2-B: Inactivation of the marker protein gene by the action of
        a sequence-specific nuclease

        P:    Promoter
        DS:   Recognition sequence for targeted induction of DNA
              double-strand breaks
        MP:   Sequence coding for a marker protein
        nDS:  Inactivated DS
        E:    Sequence-specific enzyme for targeted induction of
              DNA double-strand breaks

        The marker protein gene may be inactivated by a targeted
        deletion by sequence-specific induction of more than one
        sequence-specific DNA double-strand break in or close to
        said marker protein gene. The double-strand breaks may
        occur in the coding region or else the noncoding (such
        as, for example, the promoter) region and induce a
        deletion in the marker protein gene. The marker protein
        gene is preferably flanked by DS sequences and is
        completely deleted by the action of enzyme E.

Fig. 3: Inactivation of the marker protein gene by inducing an
        intramolecular homologous recombination, due to the
        action of a sequence-specific nuclease

        A/A':  Sequences with a sufficient length and homology to
               one another in order to recombine with one another
               as a consequence of the induced double-strand break
        P:     Promoter
        DS:    Recognition sequence for targeted induction of DNA
               double-strand breaks
        MP:    Sequence coding for a marker protein
        E:     Sequence-specific enzyme for targeted induction of
               DNA double-strand breaks

        The marker protein gene may be inactivated by a deletion
        by means of intramolecular homologous recombination. Said
        homologous recombination may be initiated by
        sequence-specific induction of DNA double-strand breaks at
        a recognition sequence for targeted induction of DNA
        double-strand breaks in or close to the marker protein
        gene. The homologous recombination occurs between the
        sequences A and A' which have a sufficient length and
        homology to one another in order to recombine with one
        another as a consequence of the induced double-strand
        break. The recombination causes a deletion of essential
        sequences of the marker protein gene.

Fig. 4: Inactivation of the marker protein gene by intermolecular
        homologous recombination

        A/A':  Sequences with a sufficient length and homology to
               one another in order to recombine with one another
        B/B':  Sequences with a sufficient length and homology to
               one another in order to recombine with one another
        P:     Promoter
        I:     Nucleic acid sequence/gene of interest to be
               inserted
        MP:    Sequence coding for a marker protein

        The marker protein gene (P-MP) may also be inactivated by
        a targeted insertion into the marker protein gene, for
        example by means of intermolecular homologous
        recombination. In this context, the region to be inserted
        is flanked on its 5' and 3' ends by nucleic acid sequences
        (A' and B', respectively), which have a sufficient length
        and homology to corresponding flanking sequences of the
        marker protein gene (A and B, respectively) in order to
        make possible a homologous recombination between A and A'
        and B and B'. The recombination causes a deletion of
        essential sequences of the marker protein gene.

Fig. 5: Inactivation of the marker protein gene by intermolecular
        homologous recombination due to the action of a
        sequence-specific nuclease

        A/A':  Sequences with a sufficient length and homology to
               one another in order to recombine with one another
        B/B':  Sequences with a sufficient length and homology to
               one another in order to recombine with one another
        P:     Promoter
        I:     Nucleic acid sequence/gene of interest to be
               inserted
        MP:    Sequence coding for a marker protein
        DS:    Recognition sequence for targeted induction of DNA
               double-strand breaks
        E:     Sequence-specific enzyme for targeted induction of
               DNA double-strand breaks

        The marker protein gene may also be inactivated by a
        targeted insertion into the marker protein gene, for
        example by means of intermolecular homologous
        recombination. The homologous recombination may be
        initiated by sequence-specific induction of DNA
        double-strand breaks at a recognition sequence for
        targeted induction of DNA double-strand breaks in or close
        to the marker protein gene. In this context, the region to
        be inserted is flanked at its 5' and 3' ends by nucleic
        acid sequences (A' and B', respectively) which have a
        sufficient length and homology to corresponding flanking
        sequences of the marker protein gene (A and B,
        respectively) in order to make possible a homologous
        recombination between A and A' and B and B'. The
        recombination causes a deletion of essential sequences of
        the marker protein gene.

Fig. 6: Vector map for pBluKS-nitP-STLS1-35S-T (SEQ ID NO: 55)

        NitP: promoter of the A. thaliana nitrilase1 gene
        (GenBank Acc. No.: Y07648.2, Hillebrand et al. (1996) Gene
        170:197-200)

        STLS-1 intron: intron of the potato ST-LS1 gene
        (Vancanneyt GF et al. (1990) Mol Gen Genet 220(2):245-250).

        35S-Term: Terminator of the 35S CaMV gene (cauliflower
        mosaic virus; Franck et al. (1980) Cell 21:285-294).

        Cleavage sites of relevant restriction endonucleases are
        indicated with their particular cleavage position.

Fig. 7: Vector map for the transgenic expression vector
        pSUN-1-codA-RNAi (SEQ ID NO: 57)

        NitP: promoter of the A. thaliana nitrilase1 gene
        (GenBank Acc. No.: Y07648.2, Hillebrand et al. (1996) Gene
        170:197-200)

        STLS-1 intron: intron of the potato ST-LS1 gene
        (Vancanneyt GF et al. (1990) Mol Gen Genet 220(2):245-250).

        35S-Term: Terminator of the 35S CaMV gene (cauliflower
        mosaic virus; Franck et al. (1980) Cell 21:285-294).

        codA-sense: Nucleic acid sequence coding for a sense RNA
        fragment of E. coli cytosine deaminase (codARNAi-sense;
        SEQ ID NO: 49)

        codA-anti: Nucleic acid sequence coding for an antisense
        RNA fragment of E. coli cytosine deaminase (codARNAi-anti;
        SEQ ID NO: 52)

        LB/RB: Left and, respectively, right boundaries of
        Agrobacterium T-DNA

        Cleavage sites of relevant restriction endonucleases are
        indicated with their particular cleavage position.
        Further elements represent customary elements of a binary
        Agrobacterium vector (aadA; ColE1; repA)

Fig. 8: Vector map for the transgenic expression vector
        pSUN1-codA-RNAi-At.Act.-2-At.Als-R-ocsT (SEQ ID NO: 58)

        NitP: promoter of the A. thaliana nitrilase1 gene
        (GenBank Acc. No.: Y07648.2, Hillebrand et al. (1996) Gene
        170:197-200)

        STLS-1 intron: intron of the potato ST-LS1 gene
        (Vancanneyt GF et al. (1990) Mol Gen Genet 220(2):245-250).

        35S-Term: Terminator of the 35S CaMV gene (cauliflower
        mosaic virus; Franck et al. (1980) Cell 21:285-294).

        codA-sense: Nucleic acid sequence coding for a sense RNA
        fragment of E. coli cytosine deaminase (codARNAi-sense;
        SEQ ID NO: 49)

        codA-anti: Nucleic acid sequence coding for an antisense
        RNA fragment of E. coli cytosine deaminase (codARNAi-anti;
        SEQ ID NO: 52)

        Left border/right border: Left and, respectively, right
        boundaries of Agrobacterium T-DNA

        Cleavage sites of relevant restriction endonucleases are
        indicated with their particular cleavage position.
        Further elements represent customary elements of a binary
        Agrobacterium vector (aadA; ColE1; repA)

Fig. 9a-b: Sequence comparison of various 5-methylthioribose (MTR)
        kinases from various organisms, in particular plant
        organisms. Sequences from Klebsiella pneumoniae,
        Clostridium tetani, Arabidopsis thaliana (A.thaliana),
        oilseed rape (Brassica napus), soybean (Soy-1), rice
        (Oryza sativa-1) and also the consensus sequence
        (Consensus) are shown. Homologous regions can be readily
        deduced from the consensus sequence.

Exemplary embodiments
General methods

The chemical synthesis of oligonucleotides may be carried out,
for example, in the known manner by using the phosphoramidite
method (Voet, Voet, 2nd Edition, Wiley Press New York, pages
896-897). The cloning steps carried out within the scope of the
present invention, such as, for example, restriction cleavages,
agarose gel electrophoresis, purification of DNA fragments,
transfer of nucleic acids to nitrocellulose and nylon membranes,
linking of DNA fragments, transformation of E. coli cells,
cultivation of bacteria, propagation of phages and sequence
analysis of recombinant DNA, are carried out as described in
Sambrook et al. (1989) Cold Spring Harbor Laboratory Press;
ISBN 0-87969-309-6. The sequencing of recombinant DNA molecules
was carried out using a laser fluorescence DNA sequencer from
ABI, according to the method of Sanger (Sanger et al. (1977)
Proc Natl Acad Sci USA 74:5463-5467).

Example 1: Preparation of codA fragments 



First, a truncated nucleic acid variant of the codA gene,
modified by the addition of recognition sequences of the
restriction enzymes HindIII and SalI, is prepared using the PCR
technique. For this purpose, part of the codA gene (GenBank
Acc. No.: S56903; SEQ ID NO: 1) is amplified from the E. coli
source organism by means of the polymerase chain reaction (PCR)
using a sense-specific primer (codA5'HindIII; SEQ ID NO: 50) and
an antisense-specific primer (codA3'SalI; SEQ ID NO: 51).

codA5'HindIII: 5'-AAGCTTGGCTAACAGTGTCGAATAACG-3' (SEQ ID NO: 50)

codA3'SalI: 5'-GTCGACGACAAAATCCCTTCCTGAGG-3' (SEQ ID NO: 51)

The PCR is carried out in a 50 µl reaction mixture which contains:

- 2 µl (200 ng) of E. coli genomic DNA
- 0.2 mM dATP, dTTP, dGTP, dCTP
- 1.5 mM Mg(OAc)2
- 5 µg of bovine serum albumin
- 40 pmol of "codA5'HindIII" primer
- 40 pmol of "codA3'SalI" primer
- 15 µl of 3.3x rTth DNA Polymerase XL buffer (PE Applied
  Biosystems)
- 5 U of rTth DNA polymerase XL (PE Applied Biosystems)

The PCR is carried out under the following cycle conditions:

Step 1: 5 minutes 94°C (denaturation)
Step 2: 3 seconds 94°C
Step 3: 1 minute 60°C (annealing)
Step 4: 2 minutes 72°C (elongation)

30 repeats of steps 2 to 4

Step 5: 10 minutes 72°C (post elongation)
Step 6: 4°C (waiting loop)
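
As a rough cross-check, the programmed hold time of this protocol
can be added up as follows. The sketch takes the step durations
exactly as printed (including the 3-second step at 94°C), assumes
that steps 2 to 4 are run 30 times in total, and ignores ramping
between temperatures and the final hold at 4°C.

```python
# (label, temperature in °C, duration in seconds), as printed above
INITIAL_DENATURATION = ("denaturation", 94, 5 * 60)
CYCLE_STEPS = [
    ("denaturation", 94, 3),
    ("annealing", 60, 60),
    ("elongation", 72, 2 * 60),
]
CYCLES = 30  # assumption: steps 2 to 4 are run 30 times in total
POST_ELONGATION = ("post elongation", 72, 10 * 60)

seconds_per_cycle = sum(duration for _, _, duration in CYCLE_STEPS)
total_seconds = (
    INITIAL_DENATURATION[2] + CYCLES * seconds_per_cycle + POST_ELONGATION[2]
)
total_minutes = total_seconds / 60

print(f"{seconds_per_cycle} s per cycle, {total_minutes} min in total")
```

Under these assumptions the run comprises 183 s per cycle and
106.5 minutes of programmed hold time.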



The amplicon (codARNAi-sense; SEQ ID NO: 49) is cloned using
standard methods into the PCR cloning vector pGEM-T (Promega).
The identity of the amplicon generated is confirmed by sequencing
using the M13F (-40) primer.

Another truncated fragment of the codA gene, modified by the
addition of recognition sequences of the restriction enzymes
EcoRI and BamHI, is amplified using a sense-specific primer
(codA5'EcoRI; SEQ ID NO: 53) and an antisense-specific primer
(codA3'BamHI; SEQ ID NO: 54).

codA5'EcoRI: 5'-GAATTCGGCTAACAGTGTCGAATAACG-3' (SEQ ID NO: 53)

codA3'BamHI: 5'-GGATCCGACAAAATCCCTTCCTGAGG-3' (SEQ ID NO: 54)
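
The design of the four primers can be checked with a short
script. It uses the primer sequences given in this example and
the standard recognition sequences of the four restriction
enzymes; the function name `split_primer` is hypothetical. The
script separates each primer into its added 5' recognition site
and its gene-specific part.

```python
# Recognition sequences of the restriction enzymes used in this example
SITES = {
    "HindIII": "AAGCTT",
    "SalI": "GTCGAC",
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
}

# Primer sequences as listed in SEQ ID NO: 50, 51, 53 and 54
PRIMERS = {
    "codA5'HindIII": "AAGCTTGGCTAACAGTGTCGAATAACG",
    "codA3'SalI": "GTCGACGACAAAATCCCTTCCTGAGG",
    "codA5'EcoRI": "GAATTCGGCTAACAGTGTCGAATAACG",
    "codA3'BamHI": "GGATCCGACAAAATCCCTTCCTGAGG",
}


def split_primer(sequence: str) -> tuple[str, str]:
    """Return (enzyme name, gene-specific part) for a 5'-tagged primer."""
    for enzyme, site in SITES.items():
        if sequence.startswith(site):
            return enzyme, sequence[len(site):]
    raise ValueError("no known recognition site at the 5' end")


for name, sequence in PRIMERS.items():
    enzyme, gene_part = split_primer(sequence)
    print(f"{name}: {enzyme} site + {gene_part}")
```

The gene-specific parts of the two sense primers are identical,
as are those of the two antisense primers, so both PCRs amplify
the same codA region and differ only in the flanking restriction
sites.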

The PCR is carried out in a 50 µl reaction mixture which contains:

- 2 µl (200 ng) of E. coli genomic DNA
- 0.2 mM dATP, dTTP, dGTP, dCTP
- 1.5 mM Mg(OAc)2
- 5 µg of bovine serum albumin
- 40 pmol of "codA5'EcoRI" primer
- 40 pmol of "codA3'BamHI" primer
- 15 µl of 3.3x rTth DNA Polymerase XL buffer (PE Applied
  Biosystems)
- 5 U of rTth DNA Polymerase XL (PE Applied Biosystems)
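
The amounts listed for the 50 µl reaction can be converted into
final concentrations with simple arithmetic. The following sketch
only restates the figures given above; the helper names are
hypothetical.

```python
REACTION_VOLUME_UL = 50.0


def final_fold(stock_fold: float, volume_ul: float) -> float:
    # Dilution of a concentrated stock (e.g. 3.3x buffer) in the reaction.
    return stock_fold * volume_ul / REACTION_VOLUME_UL


def micromolar(amount_pmol: float) -> float:
    # pmol per µl is numerically equal to µmol per litre (µM).
    return amount_pmol / REACTION_VOLUME_UL


buffer_fold = final_fold(stock_fold=3.3, volume_ul=15.0)
primer_um = micromolar(40.0)
template_ng_per_ul = 200.0 / REACTION_VOLUME_UL

print(f"buffer: {buffer_fold:.2f}x")
print(f"each primer: {primer_um:.1f} µM")
print(f"template DNA: {template_ng_per_ul:.0f} ng/µl")
```

The 15 µl of 3.3x buffer thus corresponds to approximately 1x in
the final mixture, each primer is present at 0.8 µM, and the
template DNA at 4 ng/µl.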


The PCR is carried out under the following cycle conditions:

Step 1: 5 minutes 94°C (denaturation)
Step 2: 3 seconds 94°C
Step 3: 1 minute 60°C (annealing)
Step 4: 2 minutes 72°C (elongation)

30 repeats of steps 2 to 4

Step 5: 10 minutes 72°C (post elongation)
Step 6: 4°C (waiting loop)

The amplicon (codARNAi-anti; SEQ ID NO: 52) is cloned using
standard methods into the PCR cloning vector pGEM-T (Promega).
The identity of the amplicon generated is confirmed by sequencing
using the M13F (-40) primer.



Example 2: Preparation of the transgenic expression vector for
           expressing a codA double-stranded RNA

The codA fragments generated in example 1 are used for preparing
a DNA construct suitable for expressing a double-stranded codA
RNA (pSUN-codA-RNAi). The construct is suitable for reducing the
steady-state RNA level of the codA gene in transgenic plants and,
as a result, suppressing codA gene expression by using the
double-strand RNA interference (dsRNAi) technique. For this
purpose, the codA RNAi cassette is first constructed in the
plasmid pBluKS-nitP-STLS1-35S-T and then, in a further cloning
step, completely transferred to the pSUN-1 plasmid.

The vector pBluKS-nitP-STLS1-35S-T (SEQ ID NO: 55) is a
derivative of pBluescript KS (Stratagene) and contains the
promoter of the A. thaliana nitrilase1 gene (GenBank Acc. No.:
Y07648.2, nucleotides 2456 to 4340, Hillebrand et al. (1996) Gene
170:197-200), the STLS-1 intron (Vancanneyt GF et al. (1990) Mol
Gen Genet 220(2):245-250), restriction cleavage sites flanking
the intron on its 5' and 3' sides and enabling DNA fragments to
be inserted in a directed manner, and the terminator of the 35S
CaMV gene (cauliflower mosaic virus; Franck et al. (1980) Cell
21:285-294). Using these restriction cleavage sites (HindIII,
SalI, EcoRI, BamHI), the fragments codARNAi-sense (SEQ ID NO: 49)
and codARNAi-anti (SEQ ID NO: 52) are inserted into said vector,
thereby producing the finished codA RNAi cassette.

For this purpose, the codA sense fragment (codARNAi-sense; SEQ ID
NO: 49) is first excised from the pGEM-T vector, using the
enzymes HindIII and SalI, isolated and ligated into the
pBluKS-nitP-STLS1-35S-T vector under standard conditions. This
vector had previously been cleaved using the restriction enzymes
HindIII and SalI. Correspondingly positive clones are identified
by analytical restriction digest and sequencing.

The vector obtained (pBluKS-nitP-codAsense-STLS1-35S-T) is
digested using the restriction enzymes BamHI and EcoRI. The
codA-anti fragment (codARNAi-anti; SEQ ID NO: 52) is excised from
the corresponding pGEM-T vector, using BamHI and EcoRI, isolated
and ligated into the cut vector under standard conditions.
Correspondingly positive clones which contain the complete
codA-RNAi cassette (pBluKS-nitP-codAsense-STLS1-codAanti-35S-T)
are identified by analytical restriction digest and sequencing.

The codA-RNAi cassette is transferred into the pSUN-1 vector
(SEQ ID NO: 56) by using the SacI and KpnI restriction cleavage
sites flanking the cassette. The resulting vector pSUN1-codA-RNAi
(see Fig. 7; SEQ ID NO: 57) is used for transforming transgenic
A. thaliana plants which express an active codA gene (see below).
The plant expression vector pSUN-1 is particularly suitable
within the scope of the process of the invention, since it does
not contain any other positive selection marker.

The resulting vector, pSUN1-codA-RNAi, enables an artificial
codA-dsRNA variant consisting of two identical nucleic acid
elements which are separated by an intron and inverted to one
another to be constitutively expressed. Transcription of this
artificial codA-dsRNA variant results in the formation of a
double-stranded RNA molecule, owing to the complementarity of the
inverted nucleic acid elements. The presence of this molecule
induces the suppression of codA gene expression (accumulation of
RNA) by means of double-strand RNA interference.
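
Why two identical but mutually inverted elements yield a
double-stranded RNA can be illustrated with a toy model. The
short sequences below are hypothetical placeholders and are not
the codA fragment or the ST-LS1 intron; the sketch only shows
that the second arm of such a cassette is the reverse complement
of the first, so the two arms of one transcript can pair with
each other.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    # Complement every base, then read the strand in the opposite direction.
    return dna.translate(COMPLEMENT)[::-1]


sense_arm = "GGCTAACAGTGTCG"  # hypothetical short stand-in for the sense arm
spacer = "GTAAGTTTCTGCAG"     # hypothetical stand-in for the intron

# Cloning the same fragment in inverted orientation places its reverse
# complement on the coding strand downstream of the spacer.
antisense_arm = reverse_complement(sense_arm)
cassette = sense_arm + spacer + antisense_arm

print(cassette)
print(reverse_complement(antisense_arm) == sense_arm)
```

In the transcript of such a cassette, the downstream arm can fold
back onto the upstream arm, with the spacer forming the loop (or
being removed by splicing when it is an intron).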

Example 4: Preparation of transgenic Arabidopsis thaliana plants

Transgenic Arabidopsis thaliana plants which transgenically
express the E. coli codA gene as a marker protein
("A. thaliana-[codA]") were prepared as described (Kirik et al.
(2000) EMBO J 19(20):5562-6).

The A. thaliana-[codA] plants are transformed with an
Agrobacterium tumefaciens strain (GV3101 [pMP90]) on the basis of
a modified vacuum infiltration method (Clough S & Bent A (1998)
Plant J 16(6):735-43; Bechtold N et al. (1993) CR Acad Sci Paris
1144(2):204-212). The Agrobacterium tumefaciens cells used have
previously been transformed with the DNA construct described
(pSUN1-codA-RNAi). In this way, double transgenic
A. thaliana-[codA] plants are generated which express an
artificial codA double-stranded RNA under the control of the
constitutive nitrilase1 promoter. Expression of the codA gene is
suppressed as a consequence of the dsRNAi effect induced by the
presence of this artificial codA-dsRNA. Said double transgenic
plants may be identified owing to their regained ability to grow
in the presence of 5-fluorocytosine in the culture medium.



Seeds of primary transformants are selected on the basis of the
regained ability to grow in the presence of 5-fluorocytosine. For
this purpose, the T1 seeds of the primary transformants are laid
out on selection medium containing 200 µg/ml 5-fluorocytosine.
These selection plates are incubated under long-day conditions
(16 h of light, 21°C / 8 h of darkness, 18°C). Seedlings which
develop normally in the presence of 5-fluorocytosine are
separated after 7 days and transferred to new selection plates.
These plates are incubated for another 14 days under unchanged
conditions. The resistant seedlings are then transplanted into
soil and cultured under short-day conditions (8 h of light,
21°C / 16 h of darkness, 18°C). After 14 days, the young plants
are transferred to the greenhouse and cultured under short-day
conditions.

Example 5: Preparation of a plant transformation vector contain- 
ing an expression cassette for expressing a double- 
stranded codA RNA and a plant selection marker 



A plant selection marker consisting of a mutated variant of the
A. thaliana Als gene, coding for the acetolactate synthase under
the control of the promoter of the A. thaliana actin-2 gene
(Meagher RB & Williamson RE (1994) The plant cytoskeleton.
In The Plant Cytoskeleton (Meyerowitz, E. & Somerville, C., eds),
pp. 1049-1084. Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, New York), and the octopine synthase terminator (Gielen J
et al. (1984) EMBO J 3:835-846) is inserted into pSUN1-codA-RNAi
(see Fig. 7; SEQ ID NO: 57) (At.Act.-2-At.Als-R-ocsT).

For this purpose, the pSUN1-codA-RNAi vector is first linearized
using the restriction enzyme PvuII. Subsequently, a linear DNA
fragment with blunt ends, coding for a mutated variant of the
acetolactate synthase (Als-R gene), is ligated into said
linearized vector under standard conditions. Prior to ligation,
this DNA fragment has been digested with the restriction enzyme
KpnI and the protruding ends have been converted into blunt ends
by treatment with Pwo DNA polymerase (Roche) according to the
manufacturer's instructions. This mutated variant of the
A. thaliana Als gene cannot be inhibited by herbicides of the
imidazolinone type. By expressing this mutated A.tAls-R gene, the
plants obtain the ability to grow in the presence of the
herbicide Pursuit. Correspondingly positive clones
(pSUN1-codA-RNAi-At.Act.-2-At.Als-R-ocsT; SEQ ID NO: 58) are
identified by analytical restriction digest and sequencing.

The vector obtained enables an artificial codA RNA variant
(consisting of two identical nucleic acid elements which are
separated by an intron and inverted to one another) and a mutated
variant of the A. thaliana Als gene to be expressed
constitutively. Transcription of this artificial codA RNA variant
results in the formation of a double-stranded RNA molecule, owing
to the complementarity of the inverted nucleic acid elements. The
presence of this molecule induces the suppression of codA gene
expression (accumulation of RNA) by means of double-strand RNA
interference. Expression of the Als-R gene imparts to the plants
the ability to grow in the presence of herbicides of the
imidazolinone type.
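
The inverted-repeat principle described above can be illustrated
with a minimal Python sketch. The sequences and function names are
hypothetical placeholders chosen for illustration; they are not the
codA fragments or the intron of SEQ ID NO: 57. The sketch only
shows why a sense element followed by its inverted copy can fold
back on itself to form a double-stranded stem.

```python
# Illustrative sketch only: the sequences below are hypothetical placeholders,
# not the codA fragments or the intron contained in SEQ ID NO: 57.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_hairpin_cassette(sense: str, intron: str) -> str:
    """Join a sense element, a spacer intron and the inverted sense element."""
    return sense + intron + reverse_complement(sense)


def arms_are_complementary(cassette: str, arm_length: int) -> bool:
    """Check that the 5' arm can base-pair with the 3' arm of the cassette."""
    five_prime_arm = cassette[:arm_length]
    three_prime_arm = cassette[-arm_length:]
    return reverse_complement(three_prime_arm) == five_prime_arm


sense_element = "ATGTCGAATAACGCT"   # hypothetical 15-nt target fragment
spacer_intron = "GTAAGTTTTTTTCAG"   # hypothetical intron placeholder
cassette = build_hairpin_cassette(sense_element, spacer_intron)

print(cassette)
print(arms_are_complementary(cassette, len(sense_element)))
```

In the transcript, the two arms pair with each other while the
spacer forms the loop, which is the double-stranded structure that
the text identifies as the trigger of RNA interference.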

Example 6: Preparation of transgenic Arabidopsis thaliana plants

Transgenic Arabidopsis thaliana plants expressing the E. coli
codA gene as a marker protein ("A.thaliana-[codA]") were prepared
as described (Kirik et al. (2000) EMBO J 19(20):5562-6).

The A.thaliana-[codA] plants are transformed with an
Agrobacterium tumefaciens strain (GV3101 [pMP90]) on the basis of
a modified vacuum infiltration method (Clough S & Bent A (1998)
Plant J 16(6):735-43; Bechtold N et al. (1993) CR Acad Sci Paris
1144(2):204-212). The Agrobacterium tumefaciens cells used have
previously been transformed with the DNA construct described
(pSUN1-codA-RNAi-At.Act.-2-At.Als-R-ocsT; SEQ ID NO: 57). In this
way, double transgenic A.thaliana-[codA] plants are generated
which additionally express an artificial codA double-stranded RNA
and a herbicide-insensitive variant of the Als gene (Als-R) under
the control of the constitutive nitrilase1 promoter
(A.thaliana-[codA]-[codA-RNAi-At.Act.-2-At.Als-R-ocsT]).
Expression of the codA gene is suppressed as a consequence of the
dsRNAi effect induced by the presence of this artificial
codA-dsRNA. These double transgenic plants may be identified owing
to their regained ability to grow in the presence of
5-fluorocytosine in the culture medium. In addition, positively
transformed plants can be selected owing to their ability to grow
in the presence of the herbicide Pursuit in the culture medium.

For the purpose of selection, the T1 seeds of primary
transformants are therefore laid out on selection medium
containing 100 µg/ml 5-fluorocytosine. These selection plates are
incubated under long-day conditions (16 h of light, 21°C/8 h of
darkness, 18°C). Seedlings which develop normally in the presence
of 5-fluorocytosine are separated after 28 days and transferred to
new selection plates. These plates are incubated for another
14 days under unchanged conditions. The resistant seedlings are
then transplanted into soil and cultured under short-day
conditions (8 h of light, 21°C/16 h of darkness, 18°C). After a
further 14 days, the young plants are transferred to the
greenhouse and cultured under short-day conditions.


In addition, seeds of the primary transformants may be selected
owing to their ability to grow in the presence of the herbicide
Pursuit™. It is furthermore possible to carry out dual selection
using the herbicide Pursuit™ and 5-fluorocytosine. For this
purpose, the T1 seeds of primary transformants are laid out on
selection medium containing the herbicide Pursuit™ at a
concentration of 100 nM (in the case of dual selection, 100 µg/ml
5-fluorocytosine is likewise present). These selection plates are
incubated under long-day conditions (16 h of light, 21°C/8 h of
darkness, 18°C).



Seedlings which develop normally in the presence of Pursuit™
(Pursuit™ and 5-fluorocytosine) are separated after 28 days and
transferred to new selection plates. These plates are incubated
under unchanged conditions for another 14 days. The resistant
seedlings are then transplanted into soil and cultured under
short-day conditions (8 h of light, 21°C/16 h of darkness, 18°C).
After 14 days, the young plants are transferred to the greenhouse
and cultured under short-day conditions.
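
The selection conditions above mix mass and molar concentrations
(100 µg/ml 5-fluorocytosine, 100 nM Pursuit™). As a rough aid for
comparing them, the following sketch converts the 5-fluorocytosine
concentration into a molar value and adds up the selection
schedule. The molar mass of about 129.09 g/mol for
5-fluorocytosine (C4H4FN3O) is an assumption taken from its
formula, not from the text, and the stage names are illustrative.

```python
# Assumption (not stated in the text): molar mass of 5-fluorocytosine,
# C4H4FN3O, is about 129.09 g/mol.
FIVE_FC_MOLAR_MASS_G_PER_MOL = 129.09


def ug_per_ml_to_micromolar(conc_ug_per_ml: float, molar_mass: float) -> float:
    """Convert ug/ml to umol/l (1 ug/ml equals 1 mg/l)."""
    return conc_ug_per_ml / molar_mass * 1000.0


# Durations in days, as given for the selection procedure.
SELECTION_STAGES_DAYS = {
    "first selection plate": 28,
    "second selection plate": 14,
    "soil, short-day conditions": 14,
}

five_fc_micromolar = ug_per_ml_to_micromolar(100.0, FIVE_FC_MOLAR_MASS_G_PER_MOL)
days_until_greenhouse = sum(SELECTION_STAGES_DAYS.values())

print(f"5-fluorocytosine: {five_fc_micromolar:.0f} uM")
print(f"Days from sowing to greenhouse transfer: {days_until_greenhouse}")
```

Under this assumption, 100 µg/ml corresponds to roughly 0.77 mM,
several thousand times the molar concentration of the herbicide
used in the dual selection.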



Example 7: Analysis of the double transgenic A. thaliana plants
selected using 5-fluorocytosine and/or Pursuit
(A.thaliana-[codA]-[codA-RNAi-At.Act.-2-At.Als-R-ocsT])

Integration of the T-DNA region of the vector used for
transformation, pSUN1-codA-RNAi-A.tAls-R, into the genomic DNA of
the starting plant (A.thaliana-[codA]) and the loss of
codA-specific mRNA in these transgenic plants
(A.thaliana-[codA]-[codA-RNAi-At.Act.-2-At.Als-R-ocsT]) can be
detected by applying Southern analyses and PCR techniques or
Northern analyses.

In order to carry out said analyses, total RNA and DNA are
isolated from leaf tissue of the transgenic plants and suitable
controls (using the RNeasy Maxi Kit (RNA) and DNeasy Plant Maxi
Kit (genomic DNA), respectively, according to the manufacturer's
information by Qiagen).

In the PCR analyses, the genomic DNA may be used directly as a
basis (template) for the PCR. Total RNA is transcribed to cDNA
prior to the PCR. The cDNA synthesis is carried out using the
reverse transcriptase SuperScript II (Invitrogen) according to the
manufacturer's information.

Example 8: Detection of the reduction in the steady-state amount
of codA RNA in the positively selected double transgenic
plants (A.thaliana [codA]-[codA-RNAi-
At.Act.-2-At.Als-R-ocsT]) in comparison with the
starting plants (A.thaliana [codA]) used for
transformation, by means of cDNA synthesis with subsequent
PCR amplification.






PCR amplification of the codA-specific cDNA:

The cDNA of the codA gene (ACCESSION S56903) may be amplified
using a sense-specific primer (codA5'C-term SEQ ID NO: 69) and an
antisense-specific primer (codA3'C-term SEQ ID NO: 70). The PCR
conditions to be chosen are as follows:

The PCR was carried out in a 50 µl reaction mixture which
contained:

- 2 µl (200 ng) of cDNA from A.thaliana [codA] or A.thaliana
  [codA]-[codA-RNAi-At.Act.-2-At.Als-R-ocsT] plants
- 0.2 mM dATP, dTTP, dGTP, dCTP
- 1.5 mM Mg(OAc)2
- 5 µg of bovine serum albumin
- 40 pmol of codA5'C-term SEQ ID NO: 69
- 40 pmol of codA3'C-term SEQ ID NO: 70
- 15 µl of 3.3x rTth DNA Polymerase XL buffer (PE Applied
  Biosystems)
- 5 U of rTth DNA Polymerase XL (PE Applied Biosystems)

The PCR was carried out under the following cycle conditions:
Step 1: 5 minutes 94°C (denaturation)
Step 2: 3 seconds 94°C
Step 3: 1 minute 56°C (annealing)
Step 4: 2 minutes 72°C (elongation)
30 repeats of steps 2 to 4
Step 5: 10 minutes 72°C (post elongation)
Step 6: 4°C (waiting loop)
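
As a plausibility check on the reaction mixture and cycle
conditions given above, the following sketch computes the final
buffer strength, primer concentration and template concentration
in the 50 µl reaction, together with the nominal programme
duration. It assumes that "30 repeats of steps 2 to 4" means 30
passes in total and ignores the ramp times of the thermocycler;
all variable names are illustrative.

```python
# Values transcribed from the reaction mixture and cycle conditions above.
# Assumptions: 30 passes through steps 2 to 4 in total; ramp times ignored.
INITIAL_DENATURATION_S = 5 * 60
CYCLE_STEPS_S = {"denaturation": 3, "annealing": 60, "elongation": 2 * 60}
CYCLES = 30
POST_ELONGATION_S = 10 * 60
REACTION_VOLUME_UL = 50.0


def total_run_time_s() -> int:
    """Nominal hold time of the programme, excluding the 4 degC waiting loop."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + POST_ELONGATION_S


def final_concentration(stock: float, added_ul: float, total_ul: float) -> float:
    """Dilute a stock of the given strength into the total reaction volume."""
    return stock * added_ul / total_ul


buffer_strength_x = final_concentration(3.3, 15.0, REACTION_VOLUME_UL)
primer_micromolar = 40.0 / REACTION_VOLUME_UL        # pmol per ul equals umol per l
template_ng_per_ul = 200.0 / REACTION_VOLUME_UL

print(f"Programme time: {total_run_time_s() / 60:.1f} min")
print(f"Buffer: {buffer_strength_x:.2f}x, primers: {primer_micromolar:.1f} uM each")
print(f"Template: {template_ng_per_ul:.0f} ng/ul")
```

The 15 µl of 3.3x buffer therefore bring the 50 µl reaction to
approximately 1x, which is consistent with the stated volumes.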


In the positively selected plants, the steady-state amount of the
mRNA of the codA gene and the amount of CODA protein resulting
therefrom are reduced so much that a quantitative conversion of
5-fluorocytosine to 5-fluorouracil can no longer occur.
Consequently, these plants (in contrast to the untransformed
plants) can grow in the presence of 5-fluorocytosine. Thus it is
demonstrated that transgenic plants can be identified owing to the
applied principle of preventing expression of a negative selection
marker.






Example 9: Detection of the DNA coding for codA-RNAi by using
genomic DNA of the positively selected double transgenic
plants (A.thaliana [codA]-[codA-RNAi-
At.Act.-2-At.Als-R-ocsT])






The codA-RNAi transgene may be amplified using a codA-specific
primer (e.g. codA5'HindIII SEQ ID NO: 50) and a 35S
terminator-specific primer (35sT 5' Primer SEQ ID NO: 71). Using
this primer combination, it is possible to detect specifically
only the DNA coding for the codA RNAi construct, since the codA
gene which was already present in the starting plants
(A.thaliana [codA]) used for transformation is flanked by the nos
terminator.






The PCR conditions to be chosen are as follows:

The PCR was carried out in a 50 µl reaction mixture which
contained:

- 2 µl (200 ng) of genomic DNA from the A.thaliana
  [codA]-[codA-RNAi-At.Act.-2-At.Als-R-ocsT] plants
- 0.2 mM dATP, dTTP, dGTP, dCTP
- 1.5 mM Mg(OAc)2
- 5 µg of bovine serum albumin
- 40 pmol of codA-specific sense primer (SEQ ID NO: 50, 53 or 69)
- 40 pmol of 35sT 5' primer SEQ ID NO: 71
- 15 µl of 3.3x rTth DNA Polymerase XL buffer (PE Applied
  Biosystems)
- 5 U of rTth DNA Polymerase XL (PE Applied Biosystems)






The PCR was carried out under the following cycle conditions:
Step 1: 5 minutes 94°C (denaturation)
Step 2: 3 seconds 94°C
Step 3: 1 minute 56°C (annealing)
Step 4: 2 minutes 72°C (elongation)
30 repeats of steps 2 to 4
Step 5: 10 minutes 72°C (post elongation)
Step 6: 4°C (waiting loop)

In this way, it is possible to detect in the positively selected
plants integration of the codA-RNAi DNA construct into the
chromosomal DNA of the starting plants used for transformation.
Thus it is demonstrated that transgenic plants can be identified
owing to the applied principle of preventing expression of a
negative selection marker.

Example 10: Detection of the reduction in the steady-state amount
of codA RNA in the positively selected double transgenic
plants (A.thaliana [codA]-[codA-RNAi-
At.Act.-2-At.Als-R-ocsT]) in comparison with the
starting plants (A.thaliana [codA]) used for
transformation, by Northern analysis.



Gel-electrophoretic RNA fractionation: 

For each RNA agarose gel, 3 g of agarose are dissolved in 150 ml
of H2O (f.c. 1.5% (w/v)) in a microwave oven and cooled to 60°C.
The addition of 20 ml of 10x MEN (0.2 M MOPS, 50 mM sodium
acetate, 10 mM EDTA) and 30 ml of formaldehyde (f.c. 2.2 M) causes
further cooling so that the well-mixed solution must be poured
speedily. Formaldehyde prevents the formation of secondary
structures in the RNA, and therefore the rate of migration depends
on the molecular weight rather than on the conformation
(Lehrach H et al. (1977) Biochemistry 16:4743-4751). The RNA
samples are denatured, prior to application to the gel, in the
following mixture: 20 µl of RNA (1-2 ng/µl), 5 µl of 10x MEN
buffer, 6 µl of formaldehyde, 20 µl of formamide.

The mixture is mixed and incubated at 65°C for 10 minutes. 1/10
volume of sample buffer and 1 µl of ethidium bromide (10 mg/ml)
are added and the sample is then applied. Gel electrophoresis is
carried out in horizontal gels in 1x MEN at 120 V for two to
three hours. After electrophoresis, the gel is photographed under
UV light with the aid of a ruler for subsequent determination of
the fragment length. This is followed by blotting the RNA to a
nylon membrane according to the information in: Sambrook J et al.
Molecular cloning: A laboratory manual. Cold Spring Harbor, New
York, Cold Spring Harbor Laboratory Press, 1989.
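
The stated final concentrations of the denaturing gel can be
checked from the listed volumes. The sketch below assumes that the
final gel volume is simply the sum of the three liquid components
(150 ml + 20 ml + 30 ml) and that the solid contributes no volume;
the variable names are illustrative.

```python
# Volumes and amounts transcribed from the gel recipe above.
# Assumption: the final volume is the sum of the three liquid components.
WATER_ML = 150.0
MEN_10X_ML = 20.0
FORMALDEHYDE_ML = 30.0
AGAROSE_G = 3.0

# Composition of the 10x MEN stock in mM, as given in the text.
MEN_10X_MM = {"MOPS": 200.0, "sodium acetate": 50.0, "EDTA": 10.0}

total_volume_ml = WATER_ML + MEN_10X_ML + FORMALDEHYDE_ML
agarose_percent_w_v = AGAROSE_G / total_volume_ml * 100.0
men_strength_x = 10.0 * MEN_10X_ML / total_volume_ml
men_final_mm = {
    name: conc * MEN_10X_ML / total_volume_ml for name, conc in MEN_10X_MM.items()
}

print(f"Total volume: {total_volume_ml:.0f} ml")
print(f"Gel concentration: {agarose_percent_w_v:.1f}% (w/v)")
print(f"MEN buffer: {men_strength_x:.0f}x")
for name, conc in men_final_mm.items():
    print(f"  {name}: {conc:.0f} mM")
```

With a total of 200 ml, 3 g of solid give the stated 1.5% (w/v),
and 20 ml of 10x MEN give the 1x strength that is also used as
running buffer.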






PF 53790 



CA 02493364 2005-01-21 




Radioactive labeling of DNA fragments and Northern hybridization 

The codA cDNA fragment (codARNAi-sense SEQ ID NO: 49) can be
labeled using, for example, the High Prime kit sold by Roche
Diagnostics. The High Prime kit is based on the "random primed"
method for DNA labeling originally described by Feinberg and
Vogelstein. Labeling is carried out by denaturing approx. 25 ng
of DNA in 9-11 µl of H2O at 95°C for 10 min. After a short
incubation on ice, 4 µl of High Prime solution (contains a random
primer mixture, 4 units of Klenow polymerase and 0.125 mM dATP,
dTTP and dGTP each in a reaction buffer containing 50% glycerol)
and 3-5 µl of [α-32P]dCTP (30-50 µCi) are added. The reaction
mixture is incubated at 37°C for at least 10 min and the
unincorporated dCTP is then separated from the now radiolabeled
DNA by means of gel filtration via a Sephadex G-50 column. The
fragment is subsequently denatured at 95°C for 10 min and kept on
ice until used. The following hybridization and preincubation
buffers are used:


Hypo Hybond 

250 mM sodium phosphate buffer pH 7.2
1 mM EDTA
7% SDS (w/v)
250 mM NaCl
10 µg/ml ssDNA
5% polyethylene glycol (PEG) 6000
40% formamide



The hybridization temperature when using Hypo Hybond is 42°C and
the duration of hybridization is 16-24 h. The RNA filters are
washed using three different solutions: 2 x SSC (300 mM NaCl;
30 mM sodium citrate) + 0.1% SDS, 1 x SSC + 0.1% SDS and 0.1 x
SSC + 0.1% SDS. The duration and intensity of washing depend on
the strength of the bound radioactivity. After washing, the
filters are sealed in plastic foil and an X-ray film (X-OMat,
Kodak) is exposed overnight at -70°C. The signal strength on the
X-ray films is a measure of the amount of codA mRNA molecules in
the total RNA bound on the membranes. Thus it is possible to
detect the reduction in codA mRNA in the positively selected
plants compared to the starting plants used for transformation.

In the positively selected plants, the steady-state amount of the
mRNA of the codA gene and the amount of CODA protein resulting
therefrom are reduced so much that a quantitative conversion of
5-fluorocytosine to 5-fluorouracil can no longer occur.
Consequently, these plants (in contrast to the untransformed
plants) can grow in the presence of 5-fluorocytosine. Thus it is
demonstrated that transgenic plants can be identified owing to
the applied principle of preventing expression of a negative
selection marker.

Example 11: Summary of the results of "negative-negative"
selection

Transformation of the codA-transgenic Arabidopsis plants with the
codA-dsRNA construct (pSUN1-codA-RNAi-At.Act.-2-At.Als-R-ocsT;
SEQ ID NO: 57) results in a significantly increased number of
double transgenic plants into whose genome the RNAi construct has
been successfully integrated, in the case of both single
selection (with 5-fluorocytosine alone) and dual selection
(Pursuit™ and 5-fluorocytosine) (in each case in comparison with
untransformed plants). The analysis by means of PCR (see above)
confirms the double transgenic state for the majority of the
plants generated in this way. This successfully demonstrates the
practicability of the present invention, i.e. the usability of
repression of a negative marker for positive selection (more or
less a "negative-negative" selection).






SEQUENCE LISTING 

<110> BASF Plant Science GmbH 

<120> Novel selection processes 

<130> PF53790-AT 

<140> 
<141> 

<160> 71 

<170> PatentIn Ver. 2.1

<210> 1 
<211> 1284 
<212> DNA 

<213> Escherichia coli 

<220> 

<221> CDS 

<222> (1)..(1281) 

<223> coding for cytosine deaminase (codA) 
<400> 1 

gtg tcg aat aac gct tta caa aca att att aac gcc cgg tta cca ggc 48
Val Ser Asn Asn Ala Leu Gln Thr Ile Ile Asn Ala Arg Leu Pro Gly
1 5 10 15

gaa gag ggg ctg tgg cag att cat ctg cag gac gga aaa atc agc gcc 96
Glu Glu Gly Leu Trp Gln Ile His Leu Gln Asp Gly Lys Ile Ser Ala
20 25 30

att gat gcg caa tcc ggc gtg atg ccc ata act gaa aac agc ctg gat 144
Ile Asp Ala Gln Ser Gly Val Met Pro Ile Thr Glu Asn Ser Leu Asp
35 40 45

gcc gaa caa ggt tta gtt ata ccg ccg ttt gtg gag cca cat att cac 192
Ala Glu Gln Gly Leu Val Ile Pro Pro Phe Val Glu Pro His Ile His
50 55 60

ctg gac acc acg caa acc gcc gga caa ccg aac tgg aat cag tcc ggc 240
Leu Asp Thr Thr Gln Thr Ala Gly Gln Pro Asn Trp Asn Gln Ser Gly
65 70 75 80

acg ctg ttt gaa ggc att gaa cgc tgg gcc gag cgc aaa gcg tta tta 288
Thr Leu Phe Glu Gly Ile Glu Arg Trp Ala Glu Arg Lys Ala Leu Leu
85 90 95

acc cat gac gat gtg aaa caa cgc gca tgg caa acg ctg aaa tgg cag 336
Thr His Asp Asp Val Lys Gln Arg Ala Trp Gln Thr Leu Lys Trp Gln
100 105 110

att gcc aac ggc att cag cat gtg cgt acc cat gtc gat gtt tcg gat 384
Ile Ala Asn Gly Ile Gln His Val Arg Thr His Val Asp Val Ser Asp
115 120 125

gca acg cta act gcg ctg aaa gca atg ctg gaa gtg aag cag gaa gtc 432
Ala Thr Leu Thr Ala Leu Lys Ala Met Leu Glu Val Lys Gln Glu Val
130 135 140

gcg ccg tgg att gat ctg caa atc gtc gcc ttc cct cag gaa ggg att 480
Ala Pro Trp Ile Asp Leu Gln Ile Val Ala Phe Pro Gln Glu Gly Ile
145 150 155 160

ttg tcg tat ccc aac ggt gaa gcg ttg ctg gaa gag gcg tta cgc tta 528
Leu Ser Tyr Pro Asn Gly Glu Ala Leu Leu Glu Glu Ala Leu Arg Leu
165 170 175

ggg gca gat gta gtg ggg gcg att ccg cat ttt gaa ttt acc cgt gaa 576
Gly Ala Asp Val Val Gly Ala Ile Pro His Phe Glu Phe Thr Arg Glu
180 185 190

tac ggc gtg gag tcg ctg cat aaa acc ttc gcc ctg gcg caa aaa tac 624
Tyr Gly Val Glu Ser Leu His Lys Thr Phe Ala Leu Ala Gln Lys Tyr
195 200 205

gac cgt ctc atc gac gtt cac tgt gat gag atc gat gac gag cag tcg 672
Asp Arg Leu Ile Asp Val His Cys Asp Glu Ile Asp Asp Glu Gln Ser
210 215 220

cgc ttt gtc gaa acc gtt gct gcc ctg gcg cac cat gaa ggc atg ggc 720
Arg Phe Val Glu Thr Val Ala Ala Leu Ala His His Glu Gly Met Gly
225 230 235 240

gcg cga gtc acc gcc agc cac acc acg gca atg cac tcc tat aac ggg 768
Ala Arg Val Thr Ala Ser His Thr Thr Ala Met His Ser Tyr Asn Gly
245 250 255

gcg tat acc tca cgc ctg ttc cgc ttg ctg aaa atg tcc ggt att aac 816
Ala Tyr Thr Ser Arg Leu Phe Arg Leu Leu Lys Met Ser Gly Ile Asn
260 265 270

ttt gtc gcc aac ccg ctg gtc aat att cat ctg caa gga cgt ttc gat 864
Phe Val Ala Asn Pro Leu Val Asn Ile His Leu Gln Gly Arg Phe Asp
275 280 285

acg tat cca aaa cgt cgc ggc atc acg cgc gtt aaa gag atg ctg gag 912
Thr Tyr Pro Lys Arg Arg Gly Ile Thr Arg Val Lys Glu Met Leu Glu
290 295 300

tcc ggc att aac gtc tgc ttt ggt cac gat gat gtc ttc gat ccg tgg 960
Ser Gly Ile Asn Val Cys Phe Gly His Asp Asp Val Phe Asp Pro Trp
305 310 315 320

tat ccg ctg gga acg gcg aat atg ctg caa gtg ctg cat atg ggg ctg 1008
Tyr Pro Leu Gly Thr Ala Asn Met Leu Gln Val Leu His Met Gly Leu
325 330 335

cat gtt tgc cag ttg atg ggc tac ggg cag att aac gat ggc ctg aat 1056
His Val Cys Gln Leu Met Gly Tyr Gly Gln Ile Asn Asp Gly Leu Asn
340 345 350

tta atc acc cac cac agc gca agg acg ttg aat ttg cag gat tac ggc 1104
Leu Ile Thr His His Ser Ala Arg Thr Leu Asn Leu Gln Asp Tyr Gly
355 360 365

att gcc gcc gga aac agc gcc aac ctg att atc ctg ccg gct gaa aat 1152
Ile Ala Ala Gly Asn Ser Ala Asn Leu Ile Ile Leu Pro Ala Glu Asn
370 375 380

ggg ttt gat gcg ctg cgc cgt cag gtt ccg gta cgt tat tcg gta cgt 1200
Gly Phe Asp Ala Leu Arg Arg Gln Val Pro Val Arg Tyr Ser Val Arg
385 390 395 400

ggc ggc aag gtg att gcc agc aca caa ccg gca caa acc acc gta tat 1248
Gly Gly Lys Val Ile Ala Ser Thr Gln Pro Ala Gln Thr Thr Val Tyr
405 410 415

ctg gag cag cca gaa gcc atc gat tac aaa cgt tga 1284
Leu Glu Gln Pro Glu Ala Ile Asp Tyr Lys Arg
420 425

<210> 2
<211> 427 
<212> PRT 

<213> Escherichia coli 
<400> 2 

Val Ser Asn Asn Ala Leu Gln Thr Ile Ile Asn Ala Arg Leu Pro Gly
1 5 10 15

Glu Glu Gly Leu Trp Gln Ile His Leu Gln Asp Gly Lys Ile Ser Ala
20 25 30

Ile Asp Ala Gln Ser Gly Val Met Pro Ile Thr Glu Asn Ser Leu Asp
35 40 45

Ala Glu Gln Gly Leu Val Ile Pro Pro Phe Val Glu Pro His Ile His
50 55 60

Leu Asp Thr Thr Gln Thr Ala Gly Gln Pro Asn Trp Asn Gln Ser Gly
65 70 75 80

Thr Leu Phe Glu Gly Ile Glu Arg Trp Ala Glu Arg Lys Ala Leu Leu
85 90 95

Thr His Asp Asp Val Lys Gln Arg Ala Trp Gln Thr Leu Lys Trp Gln
100 105 110

Ile Ala Asn Gly Ile Gln His Val Arg Thr His Val Asp Val Ser Asp
115 120 125

Ala Thr Leu Thr Ala Leu Lys Ala Met Leu Glu Val Lys Gln Glu Val
130 135 140

Ala Pro Trp Ile Asp Leu Gln Ile Val Ala Phe Pro Gln Glu Gly Ile
145 150 155 160

Leu Ser Tyr Pro Asn Gly Glu Ala Leu Leu Glu Glu Ala Leu Arg Leu
165 170 175

Gly Ala Asp Val Val Gly Ala Ile Pro His Phe Glu Phe Thr Arg Glu
180 185 190

Tyr Gly Val Glu Ser Leu His Lys Thr Phe Ala Leu Ala Gln Lys Tyr
195 200 205

Asp Arg Leu Ile Asp Val His Cys Asp Glu Ile Asp Asp Glu Gln Ser
210 215 220

Arg Phe Val Glu Thr Val Ala Ala Leu Ala His His Glu Gly Met Gly
225 230 235 240

Ala Arg Val Thr Ala Ser His Thr Thr Ala Met His Ser Tyr Asn Gly
245 250 255

Ala Tyr Thr Ser Arg Leu Phe Arg Leu Leu Lys Met Ser Gly Ile Asn
260 265 270

Phe Val Ala Asn Pro Leu Val Asn Ile His Leu Gln Gly Arg Phe Asp
275 280 285

Thr Tyr Pro Lys Arg Arg Gly Ile Thr Arg Val Lys Glu Met Leu Glu
290 295 300

Ser Gly Ile Asn Val Cys Phe Gly His Asp Asp Val Phe Asp Pro Trp
305 310 315 320

Tyr Pro Leu Gly Thr Ala Asn Met Leu Gln Val Leu His Met Gly Leu
325 330 335

His Val Cys Gln Leu Met Gly Tyr Gly Gln Ile Asn Asp Gly Leu Asn
340 345 350

Leu Ile Thr His His Ser Ala Arg Thr Leu Asn Leu Gln Asp Tyr Gly
355 360 365

Ile Ala Ala Gly Asn Ser Ala Asn Leu Ile Ile Leu Pro Ala Glu Asn
370 375 380

Gly Phe Asp Ala Leu Arg Arg Gln Val Pro Val Arg Tyr Ser Val Arg
385 390 395 400

Gly Gly Lys Val Ile Ala Ser Thr Gln Pro Ala Gln Thr Thr Val Tyr
405 410 415

Leu Glu Gln Pro Glu Ala Ile Asp Tyr Lys Arg
420 425


<210> 3 
<211> 1284 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: coding for 
cytosine deaminase (codA) 

<220> 

<221> misc_feature
<222> (1)..(3)
<223> mutation of GTG to ATG start codon for expression
in eukaryotic hosts

<220>
<221> CDS
<222> (1)..(1281)
<223> coding for cytosine deaminase (codA)
<400> 3 

atg tcg aat aac gct tta caa aca att att aac gcc cgg tta cca ggc 48
Met Ser Asn Asn Ala Leu Gln Thr Ile Ile Asn Ala Arg Leu Pro Gly
1 5 10 15

gaa gag ggg ctg tgg cag att cat ctg cag gac gga aaa atc agc gcc 96
Glu Glu Gly Leu Trp Gln Ile His Leu Gln Asp Gly Lys Ile Ser Ala
20 25 30

att gat gcg caa tcc ggc gtg atg ccc ata act gaa aac agc ctg gat 144
Ile Asp Ala Gln Ser Gly Val Met Pro Ile Thr Glu Asn Ser Leu Asp
35 40 45

gcc gaa caa ggt tta gtt ata ccg ccg ttt gtg gag cca cat att cac 192
Ala Glu Gln Gly Leu Val Ile Pro Pro Phe Val Glu Pro His Ile His
50 55 60

ctg gac acc acg caa acc gcc gga caa ccg aac tgg aat cag tcc ggc 240
Leu Asp Thr Thr Gln Thr Ala Gly Gln Pro Asn Trp Asn Gln Ser Gly
65 70 75 80

acg ctg ttt gaa ggc att gaa cgc tgg gcc gag cgc aaa gcg tta tta 288
Thr Leu Phe Glu Gly Ile Glu Arg Trp Ala Glu Arg Lys Ala Leu Leu
85 90 95

acc cat gac gat gtg aaa caa cgc gca tgg caa acg ctg aaa tgg cag 336
Thr His Asp Asp Val Lys Gln Arg Ala Trp Gln Thr Leu Lys Trp Gln
100 105 110

att gcc aac ggc att cag cat gtg cgt acc cat gtc gat gtt tcg gat 384
Ile Ala Asn Gly Ile Gln His Val Arg Thr His Val Asp Val Ser Asp
115 120 125

gca acg cta act gcg ctg aaa gca atg ctg gaa gtg aag cag gaa gtc 432
Ala Thr Leu Thr Ala Leu Lys Ala Met Leu Glu Val Lys Gln Glu Val
130 135 140

gcg ccg tgg att gat ctg caa atc gtc gcc ttc cct cag gaa ggg att 480
Ala Pro Trp Ile Asp Leu Gln Ile Val Ala Phe Pro Gln Glu Gly Ile
145 150 155 160

ttg tcg tat ccc aac ggt gaa gcg ttg ctg gaa gag gcg tta cgc tta 528
Leu Ser Tyr Pro Asn Gly Glu Ala Leu Leu Glu Glu Ala Leu Arg Leu
165 170 175

ggg gca gat gta gtg ggg gcg att ccg cat ttt gaa ttt acc cgt gaa 576
Gly Ala Asp Val Val Gly Ala Ile Pro His Phe Glu Phe Thr Arg Glu
180 185 190

tac ggc gtg gag tcg ctg cat aaa acc ttc gcc ctg gcg caa aaa tac 624
Tyr Gly Val Glu Ser Leu His Lys Thr Phe Ala Leu Ala Gln Lys Tyr
195 200 205

gac cgt ctc atc gac gtt cac tgt gat gag atc gat gac gag cag tcg 672
Asp Arg Leu Ile Asp Val His Cys Asp Glu Ile Asp Asp Glu Gln Ser
210 215 220

cgc ttt gtc gaa acc gtt gct gcc ctg gcg cac cat gaa ggc atg ggc 720
Arg Phe Val Glu Thr Val Ala Ala Leu Ala His His Glu Gly Met Gly
225 230 235 240

gcg cga gtc acc gcc agc cac acc acg gca atg cac tcc tat aac ggg 768
Ala Arg Val Thr Ala Ser His Thr Thr Ala Met His Ser Tyr Asn Gly
245 250 255

gcg tat acc tca cgc ctg ttc cgc ttg ctg aaa atg tcc ggt att aac 816
Ala Tyr Thr Ser Arg Leu Phe Arg Leu Leu Lys Met Ser Gly Ile Asn
260 265 270

ttt gtc gcc aac ccg ctg gtc aat att cat ctg caa gga cgt ttc gat 864
Phe Val Ala Asn Pro Leu Val Asn Ile His Leu Gln Gly Arg Phe Asp
275 280 285

acg tat cca aaa cgt cgc ggc atc acg cgc gtt aaa gag atg ctg gag 912
Thr Tyr Pro Lys Arg Arg Gly Ile Thr Arg Val Lys Glu Met Leu Glu
290 295 300

tcc ggc att aac gtc tgc ttt ggt cac gat gat gtc ttc gat ccg tgg 960
Ser Gly Ile Asn Val Cys Phe Gly His Asp Asp Val Phe Asp Pro Trp
305 310 315 320

tat ccg ctg gga acg gcg aat atg ctg caa gtg ctg cat atg ggg ctg 1008
Tyr Pro Leu Gly Thr Ala Asn Met Leu Gln Val Leu His Met Gly Leu
325 330 335

cat gtt tgc cag ttg atg ggc tac ggg cag att aac gat ggc ctg aat 1056
His Val Cys Gln Leu Met Gly Tyr Gly Gln Ile Asn Asp Gly Leu Asn
340 345 350

tta atc acc cac cac agc gca agg acg ttg aat ttg cag gat tac ggc 1104
Leu Ile Thr His His Ser Ala Arg Thr Leu Asn Leu Gln Asp Tyr Gly
355 360 365

att gcc gcc gga aac agc gcc aac ctg att atc ctg ccg gct gaa aat 1152
Ile Ala Ala Gly Asn Ser Ala Asn Leu Ile Ile Leu Pro Ala Glu Asn
370 375 380

ggg ttt gat gcg ctg cgc cgt cag gtt ccg gta cgt tat tcg gta cgt 1200
Gly Phe Asp Ala Leu Arg Arg Gln Val Pro Val Arg Tyr Ser Val Arg
385 390 395 400

ggc ggc aag gtg att gcc agc aca caa ccg gca caa acc acc gta tat 1248
Gly Gly Lys Val Ile Ala Ser Thr Gln Pro Ala Gln Thr Thr Val Tyr
405 410 415

ctg gag cag cca gaa gcc atc gat tac aaa cgt tga 1284
Leu Glu Gln Pro Glu Ala Ile Asp Tyr Lys Arg
420 425
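
The relationship between SEQ ID NO: 1 and SEQ ID NO: 3 (exchange
of the GTG start codon for ATG) and between the stated lengths
(1284 nucleotides, 427 amino acids) can be checked with a short
sketch. It translates only the first 16 codons of the listing with
the standard genetic code; the helper names are illustrative and
not part of the sequence listing.

```python
from itertools import product

# Standard genetic code in TCAG order (NCBI translation table 1).
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon into one-letter amino acids."""
    dna = dna.lower()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# First 48 nucleotides (16 codons) of SEQ ID NO: 1 as given in the listing.
seq1_start = "".join(
    "gtg tcg aat aac gct tta caa aca att att aac gcc cgg tta cca ggc".split()
)
# SEQ ID NO: 3 differs only in the first codon (GTG replaced by ATG).
seq3_start = "atg" + seq1_start[3:]

print(translate(seq1_start))
print(translate(seq3_start))

# 1284 nucleotides include one stop codon, leaving 427 coding codons.
TOTAL_NUCLEOTIDES = 1284
print((TOTAL_NUCLEOTIDES - 3) // 3)
```

The standard table reads the first codon of SEQ ID NO: 1 as Val
and that of SEQ ID NO: 3 as Met, in agreement with the two protein
listings, while all following residues are identical.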

<210> 4 
<211> 427 
<212> PRT 

<213> Artificial sequence 

<223> Description of the artificial sequence: coding for 
cytosine deaminase (codA) 

<400> 4 

Met Ser Asn Asn Ala Leu Gln Thr Ile Ile Asn Ala Arg Leu Pro Gly
1 5 10 15

Glu Glu Gly Leu Trp Gln Ile His Leu Gln Asp Gly Lys Ile Ser Ala
20 25 30

Ile Asp Ala Gln Ser Gly Val Met Pro Ile Thr Glu Asn Ser Leu Asp
35 40 45

Ala Glu Gln Gly Leu Val Ile Pro Pro Phe Val Glu Pro His Ile His
50 55 60

Leu Asp Thr Thr Gln Thr Ala Gly Gln Pro Asn Trp Asn Gln Ser Gly
65 70 75 80

Thr Leu Phe Glu Gly Ile Glu Arg Trp Ala Glu Arg Lys Ala Leu Leu
85 90 95

Thr His Asp Asp Val Lys Gln Arg Ala Trp Gln Thr Leu Lys Trp Gln
100 105 110

Ile Ala Asn Gly Ile Gln His Val Arg Thr His Val Asp Val Ser Asp
115 120 125

Ala Thr Leu Thr Ala Leu Lys Ala Met Leu Glu Val Lys Gln Glu Val
130 135 140

Ala Pro Trp Ile Asp Leu Gln Ile Val Ala Phe Pro Gln Glu Gly Ile
145 150 155 160

Leu Ser Tyr Pro Asn Gly Glu Ala Leu Leu Glu Glu Ala Leu Arg Leu
165 170 175

Gly Ala Asp Val Val Gly Ala Ile Pro His Phe Glu Phe Thr Arg Glu
180 185 190

Tyr Gly Val Glu Ser Leu His Lys Thr Phe Ala Leu Ala Gln Lys Tyr
195 200 205

Asp Arg Leu Ile Asp Val His Cys Asp Glu Ile Asp Asp Glu Gln Ser
210 215 220

Arg Phe Val Glu Thr Val Ala Ala Leu Ala His His Glu Gly Met Gly
225 230 235 240

Ala Arg Val Thr Ala Ser His Thr Thr Ala Met His Ser Tyr Asn Gly
245 250 255

Ala Tyr Thr Ser Arg Leu Phe Arg Leu Leu Lys Met Ser Gly Ile Asn
260 265 270

Phe Val Ala Asn Pro Leu Val Asn Ile His Leu Gln Gly Arg Phe Asp
275 280 285

Thr Tyr Pro Lys Arg Arg Gly Ile Thr Arg Val Lys Glu Met Leu Glu
290 295 300

Ser Gly Ile Asn Val Cys Phe Gly His Asp Asp Val Phe Asp Pro Trp
305 310 315 320

Tyr Pro Leu Gly Thr Ala Asn Met Leu Gln Val Leu His Met Gly Leu
325 330 335

His Val Cys Gln Leu Met Gly Tyr Gly Gln Ile Asn Asp Gly Leu Asn
340 345 350

Leu Ile Thr His His Ser Ala Arg Thr Leu Asn Leu Gln Asp Tyr Gly
355 360 365

Ile Ala Ala Gly Asn Ser Ala Asn Leu Ile Ile Leu Pro Ala Glu Asn
370 375 380

Gly Phe Asp Ala Leu Arg Arg Gln Val Pro Val Arg Tyr Ser Val Arg
385 390 395 400

Gly Gly Lys Val Ile Ala Ser Thr Gln Pro Ala Gln Thr Thr Val Tyr
405 410 415

Leu Glu Gln Pro Glu Ala Ile Asp Tyr Lys Arg
420 425



<210> 5 
<211> 1221 
<212> DNA 

<213> Streptomyces griseolus 

<220> 

<221> CDS 

<222> (1)..(1218) 

<223> coding for cytochrome P450-Sul (suaC) 

<400> 5

atg acc gat acc gcc acg acg ccc cag acc acg gac gca ccc gcc ttc 48
Met Thr Asp Thr Ala Thr Thr Pro Gln Thr Thr Asp Ala Pro Ala Phe
1 5 10 15

ccg agc aac cgg agc tgt ccc tac cag tta ccg gac ggc tac gcc cag 96
Pro Ser Asn Arg Ser Cys Pro Tyr Gln Leu Pro Asp Gly Tyr Ala Gln
20 25 30

ctc cgg gac acc ccc ggc ccc ctg cac cgg gtg acg ctc tac gac ggc 144
Leu Arg Asp Thr Pro Gly Pro Leu His Arg Val Thr Leu Tyr Asp Gly
35 40 45

cgt cag gcg tgg gtg gtg acc aag cac gag gcc gcg cgc aaa ctg ctc 192
Arg Gln Ala Trp Val Val Thr Lys His Glu Ala Ala Arg Lys Leu Leu
50 55 60

ggc gac ccc cgg ctg tcc tcc aac cgg acg gac gac aac ttc ccc gcc 240
Gly Asp Pro Arg Leu Ser Ser Asn Arg Thr Asp Asp Asn Phe Pro Ala
65 70 75 80

acg tca ccg cgc ttc gag gcc gtc cgg gag agc ccg cag gcg ttc atc 288
Thr Ser Pro Arg Phe Glu Ala Val Arg Glu Ser Pro Gln Ala Phe Ile
85 90 95

ggc ctg gac ccg ccc gag cac ggc acc cgg cgg cgg atg acg atc agc 336
Gly Leu Asp Pro Pro Glu His Gly Thr Arg Arg Arg Met Thr Ile Ser
100 105 110

gag ttc acc gtc aag cgg atc aag ggc atg cgc ccc gag gtc gag gag 384
Glu Phe Thr Val Lys Arg Ile Lys Gly Met Arg Pro Glu Val Glu Glu
115 120 125

gtg gtg cac ggc ttc ctc gac gag atg ctg gcc gcc ggc ccg acc gcc 432
Val Val His Gly Phe Leu Asp Glu Met Leu Ala Ala Gly Pro Thr Ala
130 135 140

gac ctg gtc agt cag ttc gcg ctg ccg gtg ccc tcc atg gtg atc tgc 480
Asp Leu Val Ser Gln Phe Ala Leu Pro Val Pro Ser Met Val Ile Cys
145 150 155 160

cga ctc ctc ggc gtg ccc tac gcc gac cac gag ttc ttc cag gac gcg 528
Arg Leu Leu Gly Val Pro Tyr Ala Asp His Glu Phe Phe Gln Asp Ala
165 170 175

agc aag cgg ctg gtg cag tcc acg gac gcg cag agc gcg ctc acc gcg 576
Ser Lys Arg Leu Val Gln Ser Thr Asp Ala Gln Ser Ala Leu Thr Ala
180 185 190

cgg aac gac ctc gcg ggt tac ctg gac ggc ctc atc acc cag ttc cag 624
Arg Asn Asp Leu Ala Gly Tyr Leu Asp Gly Leu Ile Thr Gln Phe Gln
195 200 205

acc gaa ccg ggc gcg ggc ctg gtg ggc gct ctg gtc gcc gac cag ctg 672
Thr Glu Pro Gly Ala Gly Leu Val Gly Ala Leu Val Ala Asp Gln Leu
210 215 220

gcc aac ggc gag atc gac cgt gag gaa ctg atc tcc acc gcg atg ctg 720
Ala Asn Gly Glu Ile Asp Arg Glu Glu Leu Ile Ser Thr Ala Met Leu
225 230 235 240

ctc ctc atc gcc ggc cac gag acc acg gcc tcg atg acc tcc ctc agc 768
Leu Leu Ile Ala Gly His Glu Thr Thr Ala Ser Met Thr Ser Leu Ser
245 250 255

gta atc acc ctg ctg gac cac ccc gag cag tac gcc gcc ctg cgc gcc 816
Val Ile Thr Leu Leu Asp His Pro Glu Gln Tyr Ala Ala Leu Arg Ala
260 265 270

gac cgc agc ctc gtg ccc ggc gcg gtg gag gaa ctg ctc cgc tac ctc 864
Asp Arg Ser Leu Val Pro Gly Ala Val Glu Glu Leu Leu Arg Tyr Leu
275 280 285

gcc atc gcc gac atc gcg ggc ggc cgc gtc gcc acg gcg gac atc gag 912
Ala Ile Ala Asp Ile Ala Gly Gly Arg Val Ala Thr Ala Asp Ile Glu
290 295 300

gtc gag ggg cac ctc atc cgg gcc ggc gag ggc gtg atc gtc gtc aac 960
Val Glu Gly His Leu Ile Arg Ala Gly Glu Gly Val Ile Val Val Asn
305 310 315 320

tcg ata gcc aac cgg gac ggc acg gtg tac gag gac ccg gac gcc ctc 1008
Ser Ile Ala Asn Arg Asp Gly Thr Val Tyr Glu Asp Pro Asp Ala Leu
325 330 335

gac atc cac cgc tcc gcg cgc cac cac ctc gcc ttc ggc ttc ggc gtg 1056
Asp Ile His Arg Ser Ala Arg His His Leu Ala Phe Gly Phe Gly Val
340 345 350

cac cag tgc ctg ggc cag aac ctc gcc cgg ctg gag ctg gag gtc atc 1104
His Gln Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Glu Val Ile
355 360 365

ctc aac gcc ctc atg gac cgc gtc ccg acg ctg cga ctg gcc gtc ccc 1152
Leu Asn Ala Leu Met Asp Arg Val Pro Thr Leu Arg Leu Ala Val Pro
370 375 380

gtc gag cag ttg gtg ctg cgg ccg ggt acg acg atc cag ggc gtc aac 1200
Val Glu Gln Leu Val Leu Arg Pro Gly Thr Thr Ile Gln Gly Val Asn
385 390 395 400

gaa ctc ccg gtc acc tgg tga 1221
Glu Leu Pro Val Thr Trp
405

<210> 6 
<211> 406 
<212> PRT 

<213> Streptomyces griseolus 
<400> 6 

Met Thr Asp Thr Ala Thr Thr Pro Gln Thr Thr Asp Ala Pro Ala Phe
1 5 10 15

Pro Ser Asn Arg Ser Cys Pro Tyr Gln Leu Pro Asp Gly Tyr Ala Gln
20 25 30

Leu Arg Asp Thr Pro Gly Pro Leu His Arg Val Thr Leu Tyr Asp Gly
35 40 45

Arg Gln Ala Trp Val Val Thr Lys His Glu Ala Ala Arg Lys Leu Leu
50 55 60

Gly Asp Pro Arg Leu Ser Ser Asn Arg Thr Asp Asp Asn Phe Pro Ala
65 70 75 80

Thr Ser Pro Arg Phe Glu Ala Val Arg Glu Ser Pro Gln Ala Phe Ile
85 90 95

Gly Leu Asp Pro Pro Glu His Gly Thr Arg Arg Arg Met Thr Ile Ser
100 105 110

Glu Phe Thr Val Lys Arg Ile Lys Gly Met Arg Pro Glu Val Glu Glu
115 120 125

Val Val His Gly Phe Leu Asp Glu Met Leu Ala Ala Gly Pro Thr Ala
130 135 140

Asp Leu Val Ser Gln Phe Ala Leu Pro Val Pro Ser Met Val Ile Cys
145 150 155 160

Arg Leu Leu Gly Val Pro Tyr Ala Asp His Glu Phe Phe Gln Asp Ala
165 170 175

Ser Lys Arg Leu Val Gln Ser Thr Asp Ala Gln Ser Ala Leu Thr Ala
180 185 190

Arg Asn Asp Leu Ala Gly Tyr Leu Asp Gly Leu Ile Thr Gln Phe Gln
195 200 205

Thr Glu Pro Gly Ala Gly Leu Val Gly Ala Leu Val Ala Asp Gln Leu
210 215 220

Ala Asn Gly Glu Ile Asp Arg Glu Glu Leu Ile Ser Thr Ala Met Leu
225 230 235 240

Leu Leu Ile Ala Gly His Glu Thr Thr Ala Ser Met Thr Ser Leu Ser
245 250 255

Val Ile Thr Leu Leu Asp His Pro Glu Gln Tyr Ala Ala Leu Arg Ala
260 265 270

Asp Arg Ser Leu Val Pro Gly Ala Val Glu Glu Leu Leu Arg Tyr Leu
275 280 285

Ala Ile Ala Asp Ile Ala Gly Gly Arg Val Ala Thr Ala Asp Ile Glu
290 295 300

Val Glu Gly His Leu Ile Arg Ala Gly Glu Gly Val Ile Val Val Asn
305 310 315 320

Ser Ile Ala Asn Arg Asp Gly Thr Val Tyr Glu Asp Pro Asp Ala Leu
325 330 335

Asp Ile His Arg Ser Ala Arg His His Leu Ala Phe Gly Phe Gly Val
340 345 350

His Gln Cys Leu Gly Gln Asn Leu Ala Arg Leu Glu Leu Glu Val Ile
355 360 365

Leu Asn Ala Leu Met Asp Arg Val Pro Thr Leu Arg Leu Ala Val Pro
370 375 380

Val Glu Gln Leu Val Leu Arg Pro Gly Thr Thr Ile Gln Gly Val Asn
385 390 395 400

Glu Leu Pro Val Thr Trp
405


<210> 7 
<211> 1404 
<212> DNA 

<213> Agrobacterium tumefaciens

<220> 

<221> CDS 

<222> (1)..(1401)

<223> coding for indoleacetamide hydrolase (tms2)
<400> 7 

atg gtg ccc att acc tcg tta gca caa acc cta gaa cgc ctg aga cgg 48
Met Val Pro Ile Thr Ser Leu Ala Gln Thr Leu Glu Arg Leu Arg Arg
1 5 10 15

aaa gac tac tcc tgc tta gaa cta gta gaa act ctg ata gcg cgt tgc 96
Lys Asp Tyr Ser Cys Leu Glu Leu Val Glu Thr Leu Ile Ala Arg Cys
20 25 30

caa gct gca aaa cca tta aat gcc ctt ctg gct aca gac tgg gat ggc 144
Gln Ala Ala Lys Pro Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Gly
35 40 45

ttg cgg cga agc gcc aaa aaa att gat cgt cat gga aac gcc gga tta 192
Leu Arg Arg Ser Ala Lys Lys Ile Asp Arg His Gly Asn Ala Gly Leu
50 55 60

ggt ctt tgc ggc att cca ctc tgt ttt aag gcg aac atc gcg acc ggc 240
Gly Leu Cys Gly Ile Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

ata ttt cct aca agc gct gct act ccg gcg ctg ata aac cac ttg cca 288
Ile Phe Pro Thr Ser Ala Ala Thr Pro Ala Leu Ile Asn His Leu Pro
85 90 95

aag ata cca tcc cgc gtc gca gaa aga ctt ttt tca gct gga gca ctg 336
Lys Ile Pro Ser Arg Val Ala Glu Arg Leu Phe Ser Ala Gly Ala Leu
100 105 110

ccg ggt gcc tcg gga aac atg cat gag tta tcg ttt gga att acg agc 384
Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

aac aac tat gcc acc ggt gcg gtg cgg aac ccg tgg aat cca agt ctg 432
Asn Asn Tyr Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

ata cca gga ggc tca agc ggt ggt gtg gct gct gcg gtg gca agc cga 480
Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

ttg atg tta ggc ggc ata ggc acc gat acc ggt gca tct gtt cgc cta 528
Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

ccc gca gcc ctg tgt ggc gta gta gga ttt cga ccg acg ctt gct cga 576
Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Leu Ala Arg
180 185 190

tat cca aga gat cgg ata ata ccg gtc agc ccc acc cgg gac acc gcc 624
Tyr Pro Arg Asp Arg Ile Ile Pro Val Ser Pro Thr Arg Asp Thr Ala
195 200 205

gga atc ata gcg cag tgc gta gcc gat gtt ata atc ctc gac cag gtg 672
Gly Ile Ile Ala Gln Cys Val Ala Asp Val Ile Ile Leu Asp Gln Val
210 215 220

att tcc gga cgg tcg gcg aaa att tca ccc atg ccg ctg aag ggg ctt 720
Ile Ser Gly Arg Ser Ala Lys Ile Ser Pro Met Pro Leu Lys Gly Leu
225 230 235 240

cgg atc ggc ctc ccc act acc tac ttt tac gat gac ctt gat gct gat 768
Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

gtg gcc ttc gca gct gaa acg acg att cgc ttg cta gcc aac aga ggc 816
Val Ala Phe Ala Ala Glu Thr Thr Ile Arg Leu Leu Ala Asn Arg Gly
260 265 270

gta acc ttt gtt gaa gcc gac atc ccc cac cta gag gaa ctg aat agt 864
Val Thr Phe Val Glu Ala Asp Ile Pro His Leu Glu Glu Leu Asn Ser
275 280 285

ggg gca agt ttg cca att gcg ctt tac gaa ttt cca cac gct cta aaa 912
Gly Ala Ser Leu Pro Ile Ala Leu Tyr Glu Phe Pro His Ala Leu Lys
290 295 300

aag tat ctc gac gat ttt gtg gga aca gtt tct ttt tct gac gtt atc 960
Lys Tyr Leu Asp Asp Phe Val Gly Thr Val Ser Phe Ser Asp Val Ile
305 310 315 320

aaa gga att cgt agc ccc gat gta gcg aac att gtc agt gcg caa att 1008
Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Val Ser Ala Gln Ile
325 330 335

gat ggg cat caa att tcc aac gat gaa tat gaa ctg gcg cgt caa tcc 1056
Asp Gly His Gln Ile Ser Asn Asp Glu Tyr Glu Leu Ala Arg Gln Ser
340 345 350

ttc agg cca agg ctc cag gcc act tat cgg aat tac ttc aga ctc tat 1104
Phe Arg Pro Arg Leu Gln Ala Thr Tyr Arg Asn Tyr Phe Arg Leu Tyr
355 360 365

cag tta gat gca atc ctt ttc cca act gca ccc tta gcg gcc aaa gcc 1152
Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Ala Ala Lys Ala
370 375 380

ata ggt cag gag tcg tca gtc atc cac aat ggc tca atg atg aac act 1200
Ile Gly Gln Glu Ser Ser Val Ile His Asn Gly Ser Met Met Asn Thr
385 390 395 400

ttc aag atc tac gtg cga aat gtg gac cca agc agc aac gca ggc cta 1248
Phe Lys Ile Tyr Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

cct ggg ttg agc ctt cct gcc tgc ctt aca cct gat cgc ttg cct gtt 1296
Pro Gly Leu Ser Leu Pro Ala Cys Leu Thr Pro Asp Arg Leu Pro Val
420 425 430

gga atg gaa att gat gga tta gcg ggg tca gac cac cgt ctg tta gca 1344
Gly Met Glu Ile Asp Gly Leu Ala Gly Ser Asp His Arg Leu Leu Ala
435 440 445

atc ggg gca gca tta gaa aaa gcc ata aat ttt cct tcc ttt ccc gat 1392
Ile Gly Ala Ala Leu Glu Lys Ala Ile Asn Phe Pro Ser Phe Pro Asp
450 455 460

gct ttt aat tag 1404
Ala Phe Asn
465

<210> 8 
<211> 467
<212> PRT 

<213> Agrobacterium tumefaciens 
<400> 8 

Met Val Pro Ile Thr Ser Leu Ala Gln Thr Leu Glu Arg Leu Arg Arg
1 5 10 15

Lys Asp Tyr Ser Cys Leu Glu Leu Val Glu Thr Leu Ile Ala Arg Cys
20 25 30

Gln Ala Ala Lys Pro Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Gly
35 40 45

Leu Arg Arg Ser Ala Lys Lys Ile Asp Arg His Gly Asn Ala Gly Leu
50 55 60

Gly Leu Cys Gly Ile Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

Ile Phe Pro Thr Ser Ala Ala Thr Pro Ala Leu Ile Asn His Leu Pro
85 90 95

Lys Ile Pro Ser Arg Val Ala Glu Arg Leu Phe Ser Ala Gly Ala Leu
100 105 110

Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

Asn Asn Tyr Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Leu Ala Arg
180 185 190

Tyr Pro Arg Asp Arg Ile Ile Pro Val Ser Pro Thr Arg Asp Thr Ala
195 200 205

Gly Ile Ile Ala Gln Cys Val Ala Asp Val Ile Ile Leu Asp Gln Val
210 215 220

Ile Ser Gly Arg Ser Ala Lys Ile Ser Pro Met Pro Leu Lys Gly Leu
225 230 235 240

Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

Val Ala Phe Ala Ala Glu Thr Thr Ile Arg Leu Leu Ala Asn Arg Gly
260 265 270

Val Thr Phe Val Glu Ala Asp Ile Pro His Leu Glu Glu Leu Asn Ser
275 280 285

Gly Ala Ser Leu Pro Ile Ala Leu Tyr Glu Phe Pro His Ala Leu Lys
290 295 300

Lys Tyr Leu Asp Asp Phe Val Gly Thr Val Ser Phe Ser Asp Val Ile
305 310 315 320

Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Val Ser Ala Gln Ile
325 330 335

Asp Gly His Gln Ile Ser Asn Asp Glu Tyr Glu Leu Ala Arg Gln Ser
340 345 350

Phe Arg Pro Arg Leu Gln Ala Thr Tyr Arg Asn Tyr Phe Arg Leu Tyr
355 360 365

Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Ala Ala Lys Ala
370 375 380

Ile Gly Gln Glu Ser Ser Val Ile His Asn Gly Ser Met Met Asn Thr
385 390 395 400

Phe Lys Ile Tyr Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

Pro Gly Leu Ser Leu Pro Ala Cys Leu Thr Pro Asp Arg Leu Pro Val
420 425 430

Gly Met Glu Ile Asp Gly Leu Ala Gly Ser Asp His Arg Leu Leu Ala
435 440 445

Ile Gly Ala Ala Leu Glu Lys Ala Ile Asn Phe Pro Ser Phe Pro Asp
450 455 460

Ala Phe Asn
465



<210> 9 
<211> 1404 
<212> DNA 

<213> Agrobacterium tumefaciens 

<220> 

<221> CDS

<222> (1)..(1401)

<223> coding for indoleacetamide hydrolase (tms2)
<400> 9

atg gtg ccc att acc tcg tta gca caa acc cta gaa cgc ctg aga cgg 48
Met Val Pro Ile Thr Ser Leu Ala Gln Thr Leu Glu Arg Leu Arg Arg
1 5 10 15

aaa gac tac tcc tgc tta gaa cta gta gaa act ctg ata gcg cgt tgc 96
Lys Asp Tyr Ser Cys Leu Glu Leu Val Glu Thr Leu Ile Ala Arg Cys
20 25 30

caa gct gca aaa cca tta aat gcc ctt ctg gct aca gac tgg gat ggc 144
Gln Ala Ala Lys Pro Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Gly
35 40 45

ttg cgg cga agc gcc aaa aaa att gat cgt cat gga aac gcc gga tta 192
Leu Arg Arg Ser Ala Lys Lys Ile Asp Arg His Gly Asn Ala Gly Leu
50 55 60

ggt ctt tgc ggc att cca ctc tgt ttt aag gcg aac atc gcg acc ggc 240
Gly Leu Cys Gly Ile Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

ata ttt cct aca agc gct gct act ccg gcg ctg ata aac cac ttg cca 288
Ile Phe Pro Thr Ser Ala Ala Thr Pro Ala Leu Ile Asn His Leu Pro
85 90 95

aag ata cca tcc cgc gtc gca gaa aga ctt ttt tca gct gga gca ctg 336
Lys Ile Pro Ser Arg Val Ala Glu Arg Leu Phe Ser Ala Gly Ala Leu
100 105 110

ccg ggt gcc tcg gga aac atg cat gag tta tcg ttt gga att acg agc 384
Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

aac aac tat gcc acc ggt gcg gtg cgg aac ccg tgg aat cca agt ctg 432
Asn Asn Tyr Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

ata cca gga ggc tca agc ggt ggt gtg gct gct gcg gtg gca agc cga 480
Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

ttg atg tta ggc ggc ata ggc acc gat acc ggt gca tct gtt cgc cta 528
Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

ccc gca gcc ctg tgt ggc gta gta gga ttt cga ccg acg ctt gct cga 576
Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Leu Ala Arg
180 185 190

tat cca aga gat cgg ata ata ccg gtc agc ccc acc cgg gac acc gcc 624
Tyr Pro Arg Asp Arg Ile Ile Pro Val Ser Pro Thr Arg Asp Thr Ala
195 200 205

gga atc ata gcg cag tgc gta gcc gat gtt ata atc ctc gat cag gtg 672
Gly Ile Ile Ala Gln Cys Val Ala Asp Val Ile Ile Leu Asp Gln Val
210 215 220

att tcc gga cgg tcg gcg aaa att tca ccc atg ccg ctg aag ggg ctt 720
Ile Ser Gly Arg Ser Ala Lys Ile Ser Pro Met Pro Leu Lys Gly Leu
225 230 235 240

cgg atc ggc ctc ccc act acc tac ttt tac gat gac ctt gat gct gat 768
Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

gtg gcc ttc gca gct gaa acg acg att cgc ttg cta gcc aac aga ggc 816
Val Ala Phe Ala Ala Glu Thr Thr Ile Arg Leu Leu Ala Asn Arg Gly
260 265 270

gta acc ttt gtt gaa gcc gac atc ccc cac cta gag gaa ctg aat agt 864
Val Thr Phe Val Glu Ala Asp Ile Pro His Leu Glu Glu Leu Asn Ser
275 280 285

ggg gca agt ttg cca att gcg ctt tac gaa ttt cca cac gct cta aaa 912
Gly Ala Ser Leu Pro Ile Ala Leu Tyr Glu Phe Pro His Ala Leu Lys
290 295 300

aag tat ctc gac gat ttt gtg gga aca gtt tct ttt tct gac gtt atc 960
Lys Tyr Leu Asp Asp Phe Val Gly Thr Val Ser Phe Ser Asp Val Ile
305 310 315 320

aaa gga att cgt agc ccc gat gta gcg aac att gtc agt gcg caa att 1008
Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Val Ser Ala Gln Ile
325 330 335

gat ggg cat caa att tcc aac gat gaa tat gaa ctg gcg cgt caa tcc 1056
Asp Gly His Gln Ile Ser Asn Asp Glu Tyr Glu Leu Ala Arg Gln Ser
340 345 350

ttc agg cca agg ctc cag gcc act tat cgg aat tac ttc aga ctc tat 1104
Phe Arg Pro Arg Leu Gln Ala Thr Tyr Arg Asn Tyr Phe Arg Leu Tyr
355 360 365

cag tta gat gca atc ctt ttc cca act gca ccc tta gcg gcc aaa gcc 1152
Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Ala Ala Lys Ala
370 375 380

ata ggt cag gag tcg tca gtc atc cac aat ggc tca atg ata aac act 1200
Ile Gly Gln Glu Ser Ser Val Ile His Asn Gly Ser Met Ile Asn Thr
385 390 395 400

ttc aag atc tac gtg cga aat gtg gac cca agc agc aac gca ggc cta 1248
Phe Lys Ile Tyr Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

cct ggg ttg agc ctt cct gcc tgc ctt aca cct gat cgc ttg cct gtt 1296
Pro Gly Leu Ser Leu Pro Ala Cys Leu Thr Pro Asp Arg Leu Pro Val
420 425 430

gga atg gaa att gac gga tta gcg ggg tca gac cac cgt ctg tta gca 1344
Gly Met Glu Ile Asp Gly Leu Ala Gly Ser Asp His Arg Leu Leu Ala
435 440 445

atc ggg gca gca tta gaa aaa gcc ata aat ttt cct tcc ttt ccc gat 1392
Ile Gly Ala Ala Leu Glu Lys Ala Ile Asn Phe Pro Ser Phe Pro Asp
450 455 460

gct ttt 1404
Ala Phe Asn
465

<210> 10
<211> 467
<212> PRT

<213> Agrobacterium tumefaciens
<400> 10

Met Val Pro Ile Thr Ser Leu Ala Gln Thr Leu Glu Arg Leu Arg Arg
1 5 10 15

Lys Asp Tyr Ser Cys Leu Glu Leu Val Glu Thr Leu Ile Ala Arg Cys
20 25 30

Gln Ala Ala Lys Pro Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Gly
35 40 45

Leu Arg Arg Ser Ala Lys Lys Ile Asp Arg His Gly Asn Ala Gly Leu
50 55 60

Gly Leu Cys Gly Ile Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

Ile Phe Pro Thr Ser Ala Ala Thr Pro Ala Leu Ile Asn His Leu Pro
85 90 95

Lys Ile Pro Ser Arg Val Ala Glu Arg Leu Phe Ser Ala Gly Ala Leu
100 105 110

Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

Asn Asn Tyr Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Leu Ala Arg
180 185 190

Tyr Pro Arg Asp Arg Ile Ile Pro Val Ser Pro Thr Arg Asp Thr Ala
195 200 205

Gly Ile Ile Ala Gln Cys Val Ala Asp Val Ile Ile Leu Asp Gln Val
210 215 220

Ile Ser Gly Arg Ser Ala Lys Ile Ser Pro Met Pro Leu Lys Gly Leu
225 230 235 240

Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

Val Ala Phe Ala Ala Glu Thr Thr Ile Arg Leu Leu Ala Asn Arg Gly
260 265 270

Val Thr Phe Val Glu Ala Asp Ile Pro His Leu Glu Glu Leu Asn Ser
275 280 285

Gly Ala Ser Leu Pro Ile Ala Leu Tyr Glu Phe Pro His Ala Leu Lys
290 295 300

Lys Tyr Leu Asp Asp Phe Val Gly Thr Val Ser Phe Ser Asp Val Ile
305 310 315 320

Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Val Ser Ala Gln Ile
325 330 335

Asp Gly His Gln Ile Ser Asn Asp Glu Tyr Glu Leu Ala Arg Gln Ser
340 345 350

Phe Arg Pro Arg Leu Gln Ala Thr Tyr Arg Asn Tyr Phe Arg Leu Tyr
355 360 365

Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Ala Ala Lys Ala
370 375 380

Ile Gly Gln Glu Ser Ser Val Ile His Asn Gly Ser Met Ile Asn Thr
385 390 395 400

Phe Lys Ile Tyr Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

Pro Gly Leu Ser Leu Pro Ala Cys Leu Thr Pro Asp Arg Leu Pro Val
420 425 430

Gly Met Glu Ile Asp Gly Leu Ala Gly Ser Asp His Arg Leu Leu Ala
435 440 445

Ile Gly Ala Ala Leu Glu Lys Ala Ile Asn Phe Pro Ser Phe Pro Asp
450 455 460

Ala Phe Asn
465



<210> 11 
<211> 609 
<212> DNA 

<213> Xanthobacter autotrophicus 

<220> 

<221> CDS 

<222> (1)..(603) 

<223> coding for haloalkane dehalogenase
<400> 11

atg tca acg ttt ttt gaa ccg gag aac gga atg aaa caa aac gcc aaa 48
Met Ser Thr Phe Phe Glu Pro Glu Asn Gly Met Lys Gln Asn Ala Lys
1 5 10 15

acc gaa cga atc ctg gat gtc gcg ctc gaa ttg ctt gag aca gag ggt 96
Thr Glu Arg Ile Leu Asp Val Ala Leu Glu Leu Leu Glu Thr Glu Gly
20 25 30

gag ttt ggt ttg acg atg agg cag gtg gca acg caa gcg gac atg tcc 144
Glu Phe Gly Leu Thr Met Arg Gln Val Ala Thr Gln Ala Asp Met Ser
35 40 45

ctg agc aac gtt cag tac tat ttc aag tcc gag gac ctg ctc ctc gtg 192
Leu Ser Asn Val Gln Tyr Tyr Phe Lys Ser Glu Asp Leu Leu Leu Val
50 55 60

gcc atg gca gac cgt tac ttt caa cgg tgc ctg aca acc atg gct gag 240
Ala Met Ala Asp Arg Tyr Phe Gln Arg Cys Leu Thr Thr Met Ala Glu
65 70 75 80

cat ccg ccc tta tcg gca ggg cgt gat caa cac gcc cag tta aga gcg 288
His Pro Pro Leu Ser Ala Gly Arg Asp Gln His Ala Gln Leu Arg Ala
85 90 95

ttg tta cga gaa ctg ctc ggt cat ggt ctt gag att tcc gag atg tgt 336
Leu Leu Arg Glu Leu Leu Gly His Gly Leu Glu Ile Ser Glu Met Cys
100 105 110

cga ata ttc agg gag tac tgg gca atc gcc acc cgt aat gaa act gtt 384
Arg Ile Phe Arg Glu Tyr Trp Ala Ile Ala Thr Arg Asn Glu Thr Val
115 120 125

cac ggc tat ctc aag tcg tac tat cgg gat ctc gcc gaa gtg atg gct 432
His Gly Tyr Leu Lys Ser Tyr Tyr Arg Asp Leu Ala Glu Val Met Ala
130 135 140

gag aag ctt gcg cca ctg gcc agc agc gaa aag gcg ctg gcc gtg gcc 480
Glu Lys Leu Ala Pro Leu Ala Ser Ser Glu Lys Ala Leu Ala Val Ala
145 150 155 160

gta tct ttg gtt att cct tat gtt gag ggg tat tcg gta acg gcc att 528
Val Ser Leu Val Ile Pro Tyr Val Glu Gly Tyr Ser Val Thr Ala Ile
165 170 175

gca atg ccc gaa tcc att gat acg att tcc gag acg ctg acc aat gtg 576
Ala Met Pro Glu Ser Ile Asp Thr Ile Ser Glu Thr Leu Thr Asn Val
180 185 190

gtg ttg gag cag ctt cgc atc agc aat tcatga 609
Val Leu Glu Gln Leu Arg Ile Ser Asn
195 200

<210> 12
<211> 201
<212> PRT

<213> Xanthobacter autotrophicus
<400> 12

Met Ser Thr Phe Phe Glu Pro Glu Asn Gly Met Lys Gln Asn Ala Lys
1 5 10 15

Thr Glu Arg Ile Leu Asp Val Ala Leu Glu Leu Leu Glu Thr Glu Gly
20 25 30

Glu Phe Gly Leu Thr Met Arg Gln Val Ala Thr Gln Ala Asp Met Ser
35 40 45

Leu Ser Asn Val Gln Tyr Tyr Phe Lys Ser Glu Asp Leu Leu Leu Val
50 55 60

Ala Met Ala Asp Arg Tyr Phe Gln Arg Cys Leu Thr Thr Met Ala Glu
65 70 75 80

His Pro Pro Leu Ser Ala Gly Arg Asp Gln His Ala Gln Leu Arg Ala
85 90 95

Leu Leu Arg Glu Leu Leu Gly His Gly Leu Glu Ile Ser Glu Met Cys
100 105 110

Arg Ile Phe Arg Glu Tyr Trp Ala Ile Ala Thr Arg Asn Glu Thr Val
115 120 125

His Gly Tyr Leu Lys Ser Tyr Tyr Arg Asp Leu Ala Glu Val Met Ala
130 135 140

Glu Lys Leu Ala Pro Leu Ala Ser Ser Glu Lys Ala Leu Ala Val Ala
145 150 155 160

Val Ser Leu Val Ile Pro Tyr Val Glu Gly Tyr Ser Val Thr Ala Ile
165 170 175

Ala Met Pro Glu Ser Ile Asp Thr Ile Ser Glu Thr Leu Thr Asn Val
180 185 190

Val Leu Glu Gln Leu Arg Ile Ser Asn
195 200

<210> 13
<211> 1131
<212> DNA

<213> Herpes simplex virus 1

<220>

<221> CDS 

<222> (1)..(1128) 

<223> coding for thymidine kinase (TK) 
<400> 13 

atg gct tcg tac ccc tgc cat caa cac gcg tct gcg ttc gac cag gct 48
Met Ala Ser Tyr Pro Cys His Gln His Ala Ser Ala Phe Asp Gln Ala
1 5 10 15

gcg cgt tct cgc ggc cat agc aac cga cgt acg gcg ttg cgc cct cgc 96
Ala Arg Ser Arg Gly His Ser Asn Arg Arg Thr Ala Leu Arg Pro Arg
20 25 30

cgg cag caa gaa gcc acg gaa gtc cgc ctg gag cag aaa atg ccc acg 144
Arg Gln Gln Glu Ala Thr Glu Val Arg Leu Glu Gln Lys Met Pro Thr
35 40 45

cta ctg cgg gtt tat ata gac ggt cct cac ggg atg ggg aaa acc acc 192
Leu Leu Arg Val Tyr Ile Asp Gly Pro His Gly Met Gly Lys Thr Thr
50 55 60

acc acg caa ctg ctg gtg gcc ctg ggt tcg cgc gac gat atc gtc tac 240
Thr Thr Gln Leu Leu Val Ala Leu Gly Ser Arg Asp Asp Ile Val Tyr
65 70 75 80

gta ccc gag ccg atg act tac tgg cag gtg ctg ggg gct tcc gag aca 288
Val Pro Glu Pro Met Thr Tyr Trp Gln Val Leu Gly Ala Ser Glu Thr
85 90 95

atc gcg aac atc tac acc aca caa cac cgc ctc gac cag ggt gag ata 336
Ile Ala Asn Ile Tyr Thr Thr Gln His Arg Leu Asp Gln Gly Glu Ile
100 105 110

tcg gcc ggg gac gcg gcg gtg gta atg aca agc gcc cag ata aca atg 384
Ser Ala Gly Asp Ala Ala Val Val Met Thr Ser Ala Gln Ile Thr Met
115 120 125

ggc atg cct tat gcc gtg acc gac gcc gtt ctg gct cct cat gtc ggg 432
Gly Met Pro Tyr Ala Val Thr Asp Ala Val Leu Ala Pro His Val Gly
130 135 140

ggg gag gct ggg agt tca cat gcc ccg ccc ccg gcc ctc acc ctc atc 480
Gly Glu Ala Gly Ser Ser His Ala Pro Pro Pro Ala Leu Thr Leu Ile
145 150 155 160

ttc gac cgc cat ccc atc gcc gcc ctc ctg tgc tac ccg gcc gcg cga 528
Phe Asp Arg His Pro Ile Ala Ala Leu Leu Cys Tyr Pro Ala Ala Arg
165 170 175

tac ctt atg ggc agc atg acc ccc cag gcc gtg ctg gcg ttc gtg gcc 576
Tyr Leu Met Gly Ser Met Thr Pro Gln Ala Val Leu Ala Phe Val Ala
180 185 190

ctc atc ccg ccg acc ttg ccc ggc aca aac atc gtg ttg ggg gcc ctt 624
Leu Ile Pro Pro Thr Leu Pro Gly Thr Asn Ile Val Leu Gly Ala Leu
195 200 205

ccg gag gac aga cac atc gac cgc ctg gcc aaa cgc cag cgc ccc ggc 672
Pro Glu Asp Arg His Ile Asp Arg Leu Ala Lys Arg Gln Arg Pro Gly
210 215 220

gag cgg ctt gac ctg gct atg ctg gcc gcg att cgc cgc gtt tac ggg 720
Glu Arg Leu Asp Leu Ala Met Leu Ala Ala Ile Arg Arg Val Tyr Gly
225 230 235 240

ctg ctt gcc aat acg gtg cgg tat ctg cag ggc ggc ggg tcg tgg tgg 768
Leu Leu Ala Asn Thr Val Arg Tyr Leu Gln Gly Gly Gly Ser Trp Trp
245 250 255

gag gat tgg gga cag ctt tcg ggg acg gcc gtg ccg ccc cag ggt gcc 816
Glu Asp Trp Gly Gln Leu Ser Gly Thr Ala Val Pro Pro Gln Gly Ala
260 265 270

gag ccc cag agc aac gcg ggc cca cga ccc cat atc ggg gac acg tta 864
Glu Pro Gln Ser Asn Ala Gly Pro Arg Pro His Ile Gly Asp Thr Leu
275 280 285

ttt acc ctg ttt cgg gcc ccc gag ttg ctg gcc ccc aac ggc gac ctg 912
Phe Thr Leu Phe Arg Ala Pro Glu Leu Leu Ala Pro Asn Gly Asp Leu
290 295 300

tat aac gtg ttt gcc tgg gcc ttg gac gtc ttg gcc aaa cgc ctc cgt 960
Tyr Asn Val Phe Ala Trp Ala Leu Asp Val Leu Ala Lys Arg Leu Arg
305 310 315 320

ccc atg cac gtc ttt atc ctg gat tac gac caa tcg ccc gcc ggc tgc 1008
Pro Met His Val Phe Ile Leu Asp Tyr Asp Gln Ser Pro Ala Gly Cys
325 330 335

cgg gac gcc ctg ctg caa ctt acc tcc ggg atg gtc cag acc cac gtc 1056
Arg Asp Ala Leu Leu Gln Leu Thr Ser Gly Met Val Gln Thr His Val
340 345 350

acc acc cca ggc tcc ata ccg acg atc tgc gac ctg gcg cgc acg ttt 1104
Thr Thr Pro Gly Ser Ile Pro Thr Ile Cys Asp Leu Ala Arg Thr Phe
355 360 365

gcc cgg gag atg ggg gag gct aac tga 1131
Ala Arg Glu Met Gly Glu Ala Asn
370 375

<210> 14 
<211> 376 
<212> PRT 

<213> Herpes simplex virus 1 
<400> 14 

Met Ala Ser Tyr Pro Cys His Gln His Ala Ser Ala Phe Asp Gln Ala
1 5 10 15

Ala Arg Ser Arg Gly His Ser Asn Arg Arg Thr Ala Leu Arg Pro Arg
20 25 30

Arg Gln Gln Glu Ala Thr Glu Val Arg Leu Glu Gln Lys Met Pro Thr
35 40 45

Leu Leu Arg Val Tyr Ile Asp Gly Pro His Gly Met Gly Lys Thr Thr
50 55 60

Thr Thr Gln Leu Leu Val Ala Leu Gly Ser Arg Asp Asp Ile Val Tyr
65 70 75 80

Val Pro Glu Pro Met Thr Tyr Trp Gln Val Leu Gly Ala Ser Glu Thr
85 90 95

Ile Ala Asn Ile Tyr Thr Thr Gln His Arg Leu Asp Gln Gly Glu Ile
100 105 110

Ser Ala Gly Asp Ala Ala Val Val Met Thr Ser Ala Gln Ile Thr Met
115 120 125

Gly Met Pro Tyr Ala Val Thr Asp Ala Val Leu Ala Pro His Val Gly
130 135 140

Gly Glu Ala Gly Ser Ser His Ala Pro Pro Pro Ala Leu Thr Leu Ile
145 150 155 160

Phe Asp Arg His Pro Ile Ala Ala Leu Leu Cys Tyr Pro Ala Ala Arg
165 170 175

Tyr Leu Met Gly Ser Met Thr Pro Gln Ala Val Leu Ala Phe Val Ala
180 185 190

Leu Ile Pro Pro Thr Leu Pro Gly Thr Asn Ile Val Leu Gly Ala Leu
195 200 205

Pro Glu Asp Arg His Ile Asp Arg Leu Ala Lys Arg Gln Arg Pro Gly
210 215 220

Glu Arg Leu Asp Leu Ala Met Leu Ala Ala Ile Arg Arg Val Tyr Gly
225 230 235 240

Leu Leu Ala Asn Thr Val Arg Tyr Leu Gln Gly Gly Gly Ser Trp Trp
245 250 255

Glu Asp Trp Gly Gln Leu Ser Gly Thr Ala Val Pro Pro Gln Gly Ala
260 265 270

Glu Pro Gln Ser Asn Ala Gly Pro Arg Pro His Ile Gly Asp Thr Leu
275 280 285

Phe Thr Leu Phe Arg Ala Pro Glu Leu Leu Ala Pro Asn Gly Asp Leu
290 295 300

Tyr Asn Val Phe Ala Trp Ala Leu Asp Val Leu Ala Lys Arg Leu Arg
305 310 315 320

Pro Met His Val Phe Ile Leu Asp Tyr Asp Gln Ser Pro Ala Gly Cys
325 330 335

Arg Asp Ala Leu Leu Gln Leu Thr Ser Gly Met Val Gln Thr His Val
340 345 350

Thr Thr Pro Gly Ser Ile Pro Thr Ile Cys Asp Leu Ala Arg Thr Phe
355 360 365

Ala Arg Glu Met Gly Glu Ala Asn
370 375




<210> 15 
<211> 1131 
<212> DNA 

<213> Herpes simplex virus 1 

<220> 

<221> CDS 

<222> (1)..(1128)

<223> coding for thymidine kinase (TK) 
<400> 15 

atg gct tcg tac ccc tgc cat caa cac gcg tct gcg ttc gac cag act 48
Met Ala Ser Tyr Pro Cys His Gln His Ala Ser Ala Phe Asp Gln Ala
1 5 10 15

gcg cgt tct cgc ggc cat agc aac cga cgt acg gcg ttg cgc cct cgc 96
Ala Arg Ser Arg Gly His Ser Asn Arg Arg Thr Ala Leu Arg Pro Arg
20 25 30

cgg cag caa gaa gcc acg gaa gtc cgc ctg gag cag aaa atg ccc acg 144
Arg Gln Gln Glu Ala Thr Glu Val Arg Leu Glu Gln Lys Met Pro Thr
35 40 45

cta ctg cgg gtt tat ata gac ggt cct cac ggg atg ggg aaa acc acc 192
Leu Leu Arg Val Tyr Ile Asp Gly Pro His Gly Met Gly Lys Thr Thr
50 55 60

acc acg caa ctg ctg gtg gcc ctg ggt tcg cgc gac gat atc gtc tac 240
Thr Thr Gln Leu Leu Val Ala Leu Gly Ser Arg Asp Asp Ile Val Tyr
65 70 75 80

gta ccc gag ccg atg act tac tgg cag gtg ctg ggg gct tcc gag aca 288
Val Pro Glu Pro Met Thr Tyr Trp Gln Val Leu Gly Ala Ser Glu Thr
85 90 95

atc gcg aac atc tac acc aca caa cac cgc ctc gac cag ggt gag ata 336
Ile Ala Asn Ile Tyr Thr Thr Gln His Arg Leu Asp Gln Gly Glu Ile
100 105 110

tcg gcc ggg gac gcg gcg gtg gta atg aca agc gcc cag ata aca atg 384
Ser Ala Gly Asp Ala Ala Val Val Met Thr Ser Ala Gln Ile Thr Met
115 120 125

ggc atg cct tat gcc gtg acc gac gcc gtt ctg gct cct cat gtc ggg 432
Gly Met Pro Tyr Ala Val Thr Asp Ala Val Leu Ala Pro His Val Gly
130 135 140

ggg gag gct ggg agt tca cat gcc ccg ccc ccg gcc ctc acc ctc atc 480
Gly Glu Ala Gly Ser Ser His Ala Pro Pro Pro Ala Leu Thr Leu Ile
145 150 155 160

ttc gac cgc cat ccc atc gcc gcc ctc ctg tgc tac ccg gcc gcg cga 528
Phe Asp Arg His Pro Ile Ala Ala Leu Leu Cys Tyr Pro Ala Ala Arg
165 170 175

tac ctt atg ggc agc atg acc ccc cag gcc gtg ctg gcg ttc gtg gcc 576
Tyr Leu Met Gly Ser Met Thr Pro Gln Ala Val Leu Ala Phe Val Ala
180 185 190

ctc atc ccg ccg acc ttg ccc ggc aca aac atc gtg ttg ggg gcc ctt 624
Leu Ile Pro Pro Thr Leu Pro Gly Thr Asn Ile Val Leu Gly Ala Leu
195 200 205

ccg gag gac aga cac atc gac cgc ctg gcc aaa cgc cag cgc ccc ggc 672
Pro Glu Asp Arg His Ile Asp Arg Leu Ala Lys Arg Gln Arg Pro Gly
210 215 220

gag cgg ctt gac ctg gct atg ctg gcc gcg att cgc cgc gtt tac ggg 720
Glu Arg Leu Asp Leu Ala Met Leu Ala Ala Ile Arg Arg Val Tyr Gly
225 230 235 240

ctg ctt gcc aat acg gtg cgg tat ctg cag ggc ggc ggg tcg tgg tgg 768
Leu Leu Ala Asn Thr Val Arg Tyr Leu Gln Gly Gly Gly Ser Trp Trp
245 250 255

gag gat tgg gga cag ctt tcg ggg acg gcc gtg ccg ccc cag ggt gcc 816
Glu Asp Trp Gly Gln Leu Ser Gly Thr Ala Val Pro Pro Gln Gly Ala
260 265 270

gag ccc cag agc aac gcg ggc cca cga ccc cat atc ggg gac acg tta 864
Glu Pro Gln Ser Asn Ala Gly Pro Arg Pro His Ile Gly Asp Thr Leu
275 280 285

ttt acc ctg ttt cgg gcc ccc gag ttg ctg gcc ccc aac ggc gac ctg 912
Phe Thr Leu Phe Arg Ala Pro Glu Leu Leu Ala Pro Asn Gly Asp Leu
290 295 300

tat aac gtg ttt gcc tgg gcc ttg gac gtc ttg gcc aaa cgc ctc cgt 960
Tyr Asn Val Phe Ala Trp Ala Leu Asp Val Leu Ala Lys Arg Leu Arg
305 310 315 320

ccc atg cac gtc ttt atc ctg gat tac gac caa tcg ccc gcc ggc tgc 1008
Pro Met His Val Phe Ile Leu Asp Tyr Asp Gln Ser Pro Ala Gly Cys
325 330 335

cgg gac gcc ctg ctg caa ctt acc tcc ggg atg gtc cag acc cac gtc 1056
Arg Asp Ala Leu Leu Gln Leu Thr Ser Gly Met Val Gln Thr His Val
340 345 350

acc acc cca ggc tcc ata ccg acg atc tgc gac ctg gcg cgc acg ttt 1104
Thr Thr Pro Gly Ser Ile Pro Thr Ile Cys Asp Leu Ala Arg Thr Phe
355 360 365

gcc cgg gag atg ggg gag gct aac tga 1131
Ala Arg Glu Met Gly Glu Ala Asn
370 375

<210> 16 
<211> 376 
<212> PRT 

<213> Herpes simplex virus 1 
<400> 16 

Met Ala Ser Tyr Pro Cys His Gln His Ala Ser Ala Phe Asp Gln Ala
1 5 10 15

Ala Arg Ser Arg Gly His Ser Asn Arg Arg Thr Ala Leu Arg Pro Arg
20 25 30

Arg Gln Gln Glu Ala Thr Glu Val Arg Leu Glu Gln Lys Met Pro Thr
35 40 45

Leu Leu Arg Val Tyr Ile Asp Gly Pro His Gly Met Gly Lys Thr Thr
50 55 60

Thr Thr Gln Leu Leu Val Ala Leu Gly Ser Arg Asp Asp Ile Val Tyr
65 70 75 80

Val Pro Glu Pro Met Thr Tyr Trp Gln Val Leu Gly Ala Ser Glu Thr
85 90 95

Ile Ala Asn Ile Tyr Thr Thr Gln His Arg Leu Asp Gln Gly Glu Ile
100 105 110

Ser Ala Gly Asp Ala Ala Val Val Met Thr Ser Ala Gln Ile Thr Met
115 120 125

Gly Met Pro Tyr Ala Val Thr Asp Ala Val Leu Ala Pro His Val Gly
130 135 140

Gly Glu Ala Gly Ser Ser His Ala Pro Pro Pro Ala Leu Thr Leu Ile
145 150 155 160

Phe Asp Arg His Pro Ile Ala Ala Leu Leu Cys Tyr Pro Ala Ala Arg
165 170 175

Tyr Leu Met Gly Ser Met Thr Pro Gln Ala Val Leu Ala Phe Val Ala
180 185 190

Leu Ile Pro Pro Thr Leu Pro Gly Thr Asn Ile Val Leu Gly Ala Leu
195 200 205

Pro Glu Asp Arg His Ile Asp Arg Leu Ala Lys Arg Gln Arg Pro Gly
210 215 220

Glu Arg Leu Asp Leu Ala Met Leu Ala Ala Ile Arg Arg Val Tyr Gly
225 230 235 240

Leu Leu Ala Asn Thr Val Arg Tyr Leu Gln Gly Gly Gly Ser Trp Trp
245 250 255

Glu Asp Trp Gly Gln Leu Ser Gly Thr Ala Val Pro Pro Gln Gly Ala
260 265 270

Glu Pro Gln Ser Asn Ala Gly Pro Arg Pro His Ile Gly Asp Thr Leu
275 280 285

Phe Thr Leu Phe Arg Ala Pro Glu Leu Leu Ala Pro Asn Gly Asp Leu
290 295 300

Tyr Asn Val Phe Ala Trp Ala Leu Asp Val Leu Ala Lys Arg Leu Arg
305 310 315 320

Pro Met His Val Phe Ile Leu Asp Tyr Asp Gln Ser Pro Ala Gly Cys
325 330 335

Arg Asp Ala Leu Leu Gln Leu Thr Ser Gly Met Val Gln Thr His Val
340 345 350

Thr Thr Pro Gly Ser Ile Pro Thr Ile Cys Asp Leu Ala Arg Thr Phe
355 360 365

Ala Arg Glu Met Gly Glu Ala Asn
370 375



<210> 17 
<211> 840 
<212> DNA 

<213> Toxoplasma gondii 

<220> 

<221> CDS 

<222> (1)..(837)

<223> coding for hypoxanthine-xanthine-guanine
phosphoribosyl transferase (HXGPRTase)

<400> 17 

atg gcg tcc aaa ccc att gaa gaa tcc cgg tcg caa aaa cgg agt gcc 48
Met Ala Ser Lys Pro Ile Glu Glu Ser Arg Ser Gln Lys Arg Ser Ala
1 5 10 15

ttc tca gac atc ttc tgt tgt tgc act cct aat gaa ggg gct atc gtg 96
Phe Ser Asp Ile Phe Cys Cys Cys Thr Pro Asn Glu Gly Ala Ile Val
20 25 30

ccc agt gac cca atg gtc tcc acc agt gct cca gca cgc acc agt gct 144
Pro Ser Asp Pro Met Val Ser Thr Ser Ala Pro Ala Arg Thr Ser Ala
35 40 45

cca gcg cgc tcc agt gca ctt caa gac tac ggc aag ggc aag ggc cgt 192
Pro Ala Arg Ser Ser Ala Leu Gln Asp Tyr Gly Lys Gly Lys Gly Arg
50 55 60

att gag ccc atg tat atc ccc gac aac acc ttc tac aac gct gat gac 240
Ile Glu Pro Met Tyr Ile Pro Asp Asn Thr Phe Tyr Asn Ala Asp Asp
65 70 75 80

ttt ctt gtg ccc ccc cac tgc aag ccc tac att gac aaa atc ctc ctc 288
Phe Leu Val Pro Pro His Cys Lys Pro Tyr Ile Asp Lys Ile Leu Leu
85 90 95

cct ggt gga ttg gtc aag gac aga gtt gag aag ttg gcg tat gac atc 336
Pro Gly Gly Leu Val Lys Asp Arg Val Glu Lys Leu Ala Tyr Asp Ile
100 105 110

cac aga act tac ttc ggc gag gag ttg cac atc att tgc atc ctg aaa 384
His Arg Thr Tyr Phe Gly Glu Glu Leu His Ile Ile Cys Ile Leu Lys
115 120 125

ggc tct cgc ggc ttc ttc aac ctt ctg atc gac tac ctt gcc acc ata 432
Gly Ser Arg Gly Phe Phe Asn Leu Leu Ile Asp Tyr Leu Ala Thr Ile
130 135 140

cag aag tac agt ggt cgt gag tcc agc gtg ccc ccc ttc ttc gag cac 480
Gln Lys Tyr Ser Gly Arg Glu Ser Ser Val Pro Pro Phe Phe Glu His
145 150 155 160

tat gtc cgc ctg aag tcc tac cag aac gac aac agc aca ggc cag ctc 528
Tyr Val Arg Leu Lys Ser Tyr Gln Asn Asp Asn Ser Thr Gly Gln Leu
165 170 175

acc gtc ttg agc gac gac ttg tca atc ttt cgc gac aag cac gtt ctg 576
Thr Val Leu Ser Asp Asp Leu Ser Ile Phe Arg Asp Lys His Val Leu
180 185 190

att gtt gag gac atc gtc gac acc ggt ttc acc ctc acc gag ttc ggt 624
Ile Val Glu Asp Ile Val Asp Thr Gly Phe Thr Leu Thr Glu Phe Gly
195 200 205

gag cgc ctg aaa gcc gtc ggt ccc aag tcg atg aga atc gcc acc ctc 672
Glu Arg Leu Lys Ala Val Gly Pro Lys Ser Met Arg Ile Ala Thr Leu
210 215 220

gtc gag aag cgc aca gat cgc tcc aac agc ttg aag ggc gac ttc gtc 720
Val Glu Lys Arg Thr Asp Arg Ser Asn Ser Leu Lys Gly Asp Phe Val
225 230 235 240

ggc ttc agc att gaa gac gtc tgg atc gtt ggt tgc tgc tac gac ttc 768
Gly Phe Ser Ile Glu Asp Val Trp Ile Val Gly Cys Cys Tyr Asp Phe
245 250 255

aac gag atg ttc cgc gac ttc gac cac gtc gcc gtc ctg agc gac gcc 816
Asn Glu Met Phe Arg Asp Phe Asp His Val Ala Val Leu Ser Asp Ala
260 265 270

gct cgc aaa aag ttc gag aag taa 840
Ala Arg Lys Lys Phe Glu Lys
275


<210> 18
<211> 279
<212> PRT

<213> Toxoplasma gondii

<400> 18

Met Ala Ser Lys Pro Ile Glu Glu Ser Arg Ser Gln Lys Arg Ser Ala
1 5 10 15

Phe Ser Asp Ile Phe Cys Cys Cys Thr Pro Asn Glu Gly Ala Ile Val
20 25 30

Pro Ser Asp Pro Met Val Ser Thr Ser Ala Pro Ala Arg Thr Ser Ala
35 40 45

Pro Ala Arg Ser Ser Ala Leu Gln Asp Tyr Gly Lys Gly Lys Gly Arg
50 55 60

Ile Glu Pro Met Tyr Ile Pro Asp Asn Thr Phe Tyr Asn Ala Asp Asp
65 70 75 80

Phe Leu Val Pro Pro His Cys Lys Pro Tyr Ile Asp Lys Ile Leu Leu
85 90 95

Pro Gly Gly Leu Val Lys Asp Arg Val Glu Lys Leu Ala Tyr Asp Ile
100 105 110

His Arg Thr Tyr Phe Gly Glu Glu Leu His Ile Ile Cys Ile Leu Lys
115 120 125

Gly Ser Arg Gly Phe Phe Asn Leu Leu Ile Asp Tyr Leu Ala Thr Ile
130 135 140

Gln Lys Tyr Ser Gly Arg Glu Ser Ser Val Pro Pro Phe Phe Glu His
145 150 155 160

Tyr Val Arg Leu Lys Ser Tyr Gln Asn Asp Asn Ser Thr Gly Gln Leu
165 170 175

Thr Val Leu Ser Asp Asp Leu Ser Ile Phe Arg Asp Lys His Val Leu
180 185 190

Ile Val Glu Asp Ile Val Asp Thr Gly Phe Thr Leu Thr Glu Phe Gly
195 200 205

Glu Arg Leu Lys Ala Val Gly Pro Lys Ser Met Arg Ile Ala Thr Leu
210 215 220

Val Glu Lys Arg Thr Asp Arg Ser Asn Ser Leu Lys Gly Asp Phe Val
225 230 235 240

Gly Phe Ser Ile Glu Asp Val Trp Ile Val Gly Cys Cys Tyr Asp Phe
245 250 255

Asn Glu Met Phe Arg Asp Phe Asp His Val Ala Val Leu Ser Asp Ala
260 265 270

Ala Arg Lys Lys Phe Glu Lys
275


<210> 19 
<211> 459 
<212> DNA 

<213> Escherichia coli 

<220> 

<221> CDS 

<222> (1)..(456) 

<223> coding for xanthine-guanine phosphoribosyl 
transferase (gpt) 



<400> 19 
atg agc gaa aaa tac atc gtc acc tgg gac atg ttg cag atc cat gca 48
Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met Leu Gln Ile His Ala
1 5 10 15

cgt aaa ctc gca agc cga ctg atg cct tct gaa caa tgg aaa ggc att 96
Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu Gln Trp Lys Gly Ile
20 25 30

att gcc gta agc cgt ggc ggt ctg gta ccg ggt gcg tta :ta aca cert 144
Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly Ala Leu Leu Ala Arg
35 40 45

gaa ctg ggt att cgt cat gtc gat acc gtt tgt att tcc agc tac gat 192
Glu Leu Gly Ile Arg His Val Asp Thr Val Cys Ile Ser Ser Tyr Asp
50 55 60

cac gac aac cag cgc gag ctt aaa gtg ctg aaa cgc gca gaa ggc gat 240
His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys Arg Ala Glu Gly Asp
65 70 75 80

ggc gaa ggc ttc atc gtt att gat gac ctg gtg gat acc ggt ggt act 288
Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr
85 90 95

gcg gtt gcg att cgt gaa atg tat cca aaa gcg cac ttt gtc acc atc 336
Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala His Phe Val Thr Ile
100 105 110

ttc gca aaa ccg gct ggt cgt ccg ctg gtt gat gac tat gtt gtt gat 384
Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp Asp Tyr Val Val Asp
115 120 125

atc ccg caa gat acc tgg att gaa cag ccg tgg gat atg ggc gtc gta 432
Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp Asp Met Gly Val Val
130 135 140

ttc gtc ccg cca atc tcc ggt cgc taa 459
Phe Val Pro Pro Ile Ser Gly Arg
145 150

<210> 20 
<211> 152 
<212> PRT 

<213> Escherichia coli 
<400> 20 

Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met Leu Gln Ile His Ala
1 5 10 15

Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu Gln Trp Lys Gly Ile
20 25 30

Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly Ala Leu Leu Ala Arg
35 40 45

Glu Leu Gly Ile Arg His Val Asp Thr Val Cys Ile Ser Ser Tyr Asp
50 55 60

His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys Arg Ala Glu Gly Asp
65 70 75 80

Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr
85 90 95

Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala His Phe Val Thr Ile
100 105 110

Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp Asp Tyr Val Val Asp
115 120 125

Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp Asp Met Gly Val Val
130 135 140

Phe Val Pro Pro Ile Ser Gly Arg
145 150

<210> 21
<211> 459
<212> DNA

<213> Escherichia coli

<220>
<221> CDS
<222> (1)..(456)

<223> coding for xanthine-guanine phosphoribosyl
transferase (gpt)

<400> 21

atg agc gaa aaa tac atc gtc acc tgg gac atg ttg cag atc cat gca 48
Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met Leu Gln Ile His Ala
1 5 10 15

cgt aaa ctc gca agc cga ctg atg cct tct gaa caa tgg aaa ggc att 96
Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu Gln Trp Lys Gly Ile
20 25 30

att gcc gta agc cgt ggc ggt ctg gta ccg ggt gcg tta ctg gcg cgt 144
Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly Ala Leu Leu Ala Arg
35 40 45

gaa ctg ggt att cgt cat gtc gat acc gtt tgt att tcc agc tac gat 192
Glu Leu Gly Ile Arg His Val Asp Thr Val Cys Ile Ser Ser Tyr Asp
50 55 60

cac gac aac cag cgc gag ctt aaa gtg ctg aaa cgc gca gaa ggc gat 240
His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys Arg Ala Glu Gly Asp
65 70 75 80

ggc gaa ggc ttc atc gtt att gat gac ctg gtg gat acc ggt ggt act 288
Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr
85 90 95

gcg gtt gcg att cgt gaa atg tat cca aaa gcg cac ttt gtc acc atc 336
Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala His Phe Val Thr Ile
100 105 110

ttc gca aaa ccg gct ggt cgt ccg ctg gtt gat gac tat gtt gtt gat 384
Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp Asp Tyr Val Val Asp
115 120 125

atc ccg caa gat acc tgg att gaa cag ccg tgg gat atg ggc gtc gta 432
Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp Asp Met Gly Val Val
130 135 140

ttc gtc ccg cca atc tcc ggt cgc taa 459
Phe Val Pro Pro Ile Ser Gly Arg
145 150

<210> 22
<211> 152
<212> PRT

<213> Escherichia coli
<400> 22

Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met Leu Gln Ile His Ala
1 5 10 15

Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu Gln Trp Lys Gly Ile
20 25 30

Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly Ala Leu Leu Ala Arg
35 40 45

Glu Leu Gly Ile Arg His Val Asp Thr Val Cys Ile Ser Ser Tyr Asp
50 55 60

His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys Arg Ala Glu Gly Asp
65 70 75 80

Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr
85 90 95

Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala His Phe Val Thr Ile
100 105 110

Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp Asp Tyr Val Val Asp
115 120 125

Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp Asp Met Gly Val Val
130 135 140

Phe Val Pro Pro Ile Ser Gly Arg
145 150

<210> 23 

<211> 720 

<212> DNA 

<213> Escherichia coli 

<220> 

<221> CDS 

<222> (1)..(717) 

<223> coding for purine nucleoside phosphorylase (deoD) 
<400> 23 

atg gct acc cca cac att aat gca gaa atg ggc gat ttc gct gac gta 48
Met Ala Thr Pro His Ile Asn Ala Glu Met Gly Asp Phe Ala Asp Val
1 5 10 15

gtt ttg atg cca ggc gac ccg ctg cgt gcg aag tat att gct gaa act 96
Val Leu Met Pro Gly Asp Pro Leu Arg Ala Lys Tyr Ile Ala Glu Thr
20 25 30

ttc ctt gaa gat gcc cgt gaa gtg aac aac gtt cgc ggt atg ctg ggc 144
Phe Leu Glu Asp Ala Arg Glu Val Asn Asn Val Arg Gly Met Leu Gly
35 40 45

ttc acc ggt act tac aaa ggc cgc aaa att tcc gta atg ggt cac ggt 192
Phe Thr Gly Thr Tyr Lys Gly Arg Lys Ile Ser Val Met Gly His Gly
50 55 60

atg ggt atc ccg tcc tgc tcc atc tac acc aaa gaa ctg atc acc gat 240
Met Gly Ile Pro Ser Cys Ser Ile Tyr Thr Lys Glu Leu Ile Thr Asp
65 70 75 80

ttc ggc gtg aag aaa att atc cgc gtg ggt tcc tgt ggc gca gtt ctg 288
Phe Gly Val Lys Lys Ile Ile Arg Val Gly Ser Cys Gly Ala Val Leu
85 90 95

ccg cac gta aaa ctg cgc gac gtc gtt atc ggt atg ggt gcc tgc acc 336
Pro His Val Lys Leu Arg Asp Val Val Ile Gly Met Gly Ala Cys Thr
100 105 110

gat tcc aaa gtt aac cgc atc cgt ttt aaa gac cat gac ttt gcc gct 384
Asp Ser Lys Val Asn Arg Ile Arg Phe Lys Asp His Asp Phe Ala Ala
115 120 125

atc gct gac ttc gac atg gtg cgt aac gca gta gat gca gct aaa gca 432
Ile Ala Asp Phe Asp Met Val Arg Asn Ala Val Asp Ala Ala Lys Ala
130 135 140

ctg ggt att gat gct cgc gtg ggt aac ctg ttc tcc gct gac ctg ttc 480
Leu Gly Ile Asp Ala Arg Val Gly Asn Leu Phe Ser Ala Asp Leu Phe
145 150 155 160

tac tct ccg gac ggc gaa atg ttc gac gtg atg gaa aaa tac ggc att 528
Tyr Ser Pro Asp Gly Glu Met Phe Asp Val Met Glu Lys Tyr Gly Ile
165 170 175

ctc ggc gtg gaa atg gaa gcg gct ggt atc tac ggc gtc gct gca gaa 576
Leu Gly Val Glu Met Glu Ala Ala Gly Ile Tyr Gly Val Ala Ala Glu
180 185 190

ttt ggc gcg aaa gcc ctg acc atc tgc acc gta tct gac cac atc cgc 624
Phe Gly Ala Lys Ala Leu Thr Ile Cys Thr Val Ser Asp His Ile Arg
195 200 205

act cac gag cag acc act gcc gct gag cgt cag act acc ttc aac gac 672
Thr His Glu Gln Thr Thr Ala Ala Glu Arg Gln Thr Thr Phe Asn Asp
210 215 220

atg atc aaa atc gca ctg gaa tcc gtt ctg ctg ggc gat aaa gag taa 720
Met Ile Lys Ile Ala Leu Glu Ser Val Leu Leu Gly Asp Lys Glu
225 230 235

<210> 24 
<211> 239 
<212> PRT 

<213> Escherichia coli 
<400> 24 

Met Ala Thr Pro His Ile Asn Ala Glu Met Gly Asp Phe Ala Asp Val
1 5 10 15

Val Leu Met Pro Gly Asp Pro Leu Arg Ala Lys Tyr Ile Ala Glu Thr
20 25 30

Phe Leu Glu Asp Ala Arg Glu Val Asn Asn Val Arg Gly Met Leu Gly
35 40 45

Phe Thr Gly Thr Tyr Lys Gly Arg Lys Ile Ser Val Met Gly His Gly
50 55 60

Met Gly Ile Pro Ser Cys Ser Ile Tyr Thr Lys Glu Leu Ile Thr Asp
65 70 75 80

Phe Gly Val Lys Lys Ile Ile Arg Val Gly Ser Cys Gly Ala Val Leu
85 90 95

Pro His Val Lys Leu Arg Asp Val Val Ile Gly Met Gly Ala Cys Thr
100 105 110

Asp Ser Lys Val Asn Arg Ile Arg Phe Lys Asp His Asp Phe Ala Ala
115 120 125

Ile Ala Asp Phe Asp Met Val Arg Asn Ala Val Asp Ala Ala Lys Ala
130 135 140

Leu Gly Ile Asp Ala Arg Val Gly Asn Leu Phe Ser Ala Asp Leu Phe
145 150 155 160

Tyr Ser Pro Asp Gly Glu Met Phe Asp Val Met Glu Lys Tyr Gly Ile
165 170 175

Leu Gly Val Glu Met Glu Ala Ala Gly Ile Tyr Gly Val Ala Ala Glu
180 185 190

Phe Gly Ala Lys Ala Leu Thr Ile Cys Thr Val Ser Asp His Ile Arg
195 200 205

Thr His Glu Gln Thr Thr Ala Ala Glu Arg Gln Thr Thr Phe Asn Asp
210 215 220

Met Ile Lys Ile Ala Leu Glu Ser Val Leu Leu Gly Asp Lys Glu
225 230 235



<210> 25 

<211> 1545 

<212> DNA 

<213> Burkholderia caryophylli 

<220> 

<221> CDS 

<222> (1)..(1542) 

<223> coding for phosphonate monoester hydrolase (pehA) 
<400> 25 

atg acc aga aaa aat gtc ctg ctt atc gtc gtt gat caa tgg cga gca 48
Met Thr Arg Lys Asn Val Leu Leu Ile Val Val Asp Gln Trp Arg Ala
1 5 10 15

gat ttt atc cct cac ctg atg cgg gcg gag ggg cgc gaa cct ttc ctt 96
Asp Phe Ile Pro His Leu Met Arg Ala Glu Gly Arg Glu Pro Phe Leu
20 25 30

aaa act ccc aat ctt gat cgt ctt tgc cgg gaa ggc ttg acc ttc cgc 144
Lys Thr Pro Asn Leu Asp Arg Leu Cys Arg Glu Gly Leu Thr Phe Arg
35 40 45

aat cat gtc acg acg tgc gtg ccg tgt ggt ccg gca agg gca agc ctg 192
Asn His Val Thr Thr Cys Val Pro Cys Gly Pro Ala Arg Ala Ser Leu
50 55 60

ctg acg ggc ctc tac ctg atg aac cac cgg gcg gtg cag aac act gtt 240
Leu Thr Gly Leu Tyr Leu Met Asn His Arg Ala Val Gln Asn Thr Val
65 70 75 80

ccg ctt gac cag cgc cat cta aac ctt ggc aag gcc ctg cgc gcc att 288
Pro Leu Asp Gln Arg His Leu Asn Leu Gly Lys Ala Leu Arg Ala Ile
85 90 95

ggc tac gat ccc gcg ctc att ggt tac acc acc acg aca cct gat ccg 336
Gly Tyr Asp Pro Ala Leu Ile Gly Tyr Thr Thr Thr Thr Pro Asp Pro
100 105 110

cgc aca acc tct gca agg gat ccg cgt ttc acg gtc ctg ggc gac atc 384
Arg Thr Thr Ser Ala Arg Asp Pro Arg Phe Thr Val Leu Gly Asp Ile
115 120 125

atg gac ggc ttt cgt tcg gtc ggc gca ttc gag ccc aat atg gag ggg 432
Met Asp Gly Phe Arg Ser Val Gly Ala Phe Glu Pro Asn Met Glu Gly
130 135 140

tat ttt ggc tgg gtg gcg cag aac ggc ttc gaa ctg cca gag aac cgc 480
Tyr Phe Gly Trp Val Ala Gln Asn Gly Phe Glu Leu Pro Glu Asn Arg
145 150 155 160

gaa gat atc tgg ctg ccg gaa ggt gaa cat tcc gtt ccc ggt gct acc 528
Glu Asp Ile Trp Leu Pro Glu Gly Glu His Ser Val Pro Gly Ala Thr
165 170 175

gac aaa ccg tcg cgc att ccg aag gaa ttt tcg gat tcg aca ttc ttc 576
Asp Lys Pro Ser Arg Ile Pro Lys Glu Phe Ser Asp Ser Thr Phe Phe
180 185 190

acg gag cgc gcc ctg aca tat ctg aag ggc agg gac ggc aag cct ttc 624
Thr Glu Arg Ala Leu Thr Tyr Leu Lys Gly Arg Asp Gly Lys Pro Phe
195 200 205

ttc ctg cat ctt ggc tat tat cgc ccg cat ccg cct ttc gta gcc tcc 672
Phe Leu His Leu Gly Tyr Tyr Arg Pro His Pro Pro Phe Val Ala Ser
210 215 220

gcg ccc tac cat gcg atg tac aaa gcc gaa gat atg cct gcg cct ata 720
Ala Pro Tyr His Ala Met Tyr Lys Ala Glu Asp Met Pro Ala Pro Ile
225 230 235 240

cgt gcg gag aat ccg gat gcc gaa gcg gca cag cat ccg ctc atg aag 768
Arg Ala Glu Asn Pro Asp Ala Glu Ala Ala Gln His Pro Leu Met Lys
245 250 255

cac tat atc gac cac atc aga cgc ggc tcg ttc ttc cat ggc gcg gaa 816
His Tyr Ile Asp His Ile Arg Arg Gly Ser Phe Phe His Gly Ala Glu
260 265 270

ggc tcg gga gca acg ctt gat gaa ggc gaa att cgc cag atg cgc gct 864
Gly Ser Gly Ala Thr Leu Asp Glu Gly Glu Ile Arg Gln Met Arg Ala
275 280 285

aca tat tgc gga ctg atc acc gag atc gac gat tgt ctg ggg agg gtc 912
Thr Tyr Cys Gly Leu Ile Thr Glu Ile Asp Asp Cys Leu Gly Arg Val
290 295 300

ttt gcc tat ctc gat gaa acc ggt cag tgg gac gac acg ctg att atc 960
Phe Ala Tyr Leu Asp Glu Thr Gly Gln Trp Asp Asp Thr Leu Ile Ile
305 310 315 320

ttc acg agc gat cat ggc gaa caa ctg ggc gat cat cac ctg ctc ggc 1008
Phe Thr Ser Asp His Gly Glu Gln Leu Gly Asp His His Leu Leu Gly
325 330 335

aag atc ggt tac aat gcc gaa agc ttc cgt att ccc ttg gtc ata aag 1056
Lys Ile Gly Tyr Asn Ala Glu Ser Phe Arg Ile Pro Leu Val Ile Lys
340 345 350

gat gcg gga cag aac cgg cac gcc ggc cag atc gaa gaa ggc ttc tcc 1104
Asp Ala Gly Gln Asn Arg His Ala Gly Gln Ile Glu Glu Gly Phe Ser
355 360 365

gaa agc atc gac gtc atg ccg acc atc ctc gaa tgg ctg ggc ggg gaa 1152
Glu Ser Ile Asp Val Met Pro Thr Ile Leu Glu Trp Leu Gly Gly Glu
370 375 380

acg cct cgc gcc tgc gac ggc cgt tcg ctg ttg ccg ttt ctg gct gag 1200
Thr Pro Arg Ala Cys Asp Gly Arg Ser Leu Leu Pro Phe Leu Ala Glu
385 390 395 400

gga aag ccc tcc gac tgg cgc acg gaa cta cat tac gag ttc gat ttt 1248
Gly Lys Pro Ser Asp Trp Arg Thr Glu Leu His Tyr Glu Phe Asp Phe
405 410 415

cgc gat gtc ttc tac gat cag ccg cag aac tcg gtc cag ctt tcc cag 1296
Arg Asp Val Phe Tyr Asp Gln Pro Gln Asn Ser Val Gln Leu Ser Gln
420 425 430

gat gat tgc agc ctc tgt gtg atc gag gac gaa aac tac aag tac gtg 1344
Asp Asp Cys Ser Leu Cys Val Ile Glu Asp Glu Asn Tyr Lys Tyr Val
435 440 445

cat ttt gcc gcc ctg ccg ccg ctg ttc ttc gat ctg aag gca gac ccg 1392
His Phe Ala Ala Leu Pro Pro Leu Phe Phe Asp Leu Lys Ala Asp Pro
450 455 460

cat gaa ttc agc aat ctg gct ggc gat cct gct tat gcg gcc ctc gtt 1440
His Glu Phe Ser Asn Leu Ala Gly Asp Pro Ala Tyr Ala Ala Leu Val
465 470 475 480

cgt gac tat gcc cag aag gca ttg tcg tgg cga ctg tct cat gcc gac 1488
Arg Asp Tyr Ala Gln Lys Ala Leu Ser Trp Arg Leu Ser His Ala Asp
485 490 495

cgg aca ctc acc cat tac aga tcc agc ccg caa ggg ctg aca acg cgc 1536
Arg Thr Leu Thr His Tyr Arg Ser Ser Pro Gln Gly Leu Thr Thr Arg
500 505 510

aac cat tga 1545
Asn His

<210> 26 
<211> 514 
<212> PRT 

<213> Burkholderia caryophylli 
<400> 26 

Met Thr Arg Lys Asn Val Leu Leu Ile Val Val Asp Gln Trp Arg Ala
1 5 10 15

Asp Phe Ile Pro His Leu Met Arg Ala Glu Gly Arg Glu Pro Phe Leu
20 25 30

Lys Thr Pro Asn Leu Asp Arg Leu Cys Arg Glu Gly Leu Thr Phe Arg
35 40 45

Asn His Val Thr Thr Cys Val Pro Cys Gly Pro Ala Arg Ala Ser Leu
50 55 60

Leu Thr Gly Leu Tyr Leu Met Asn His Arg Ala Val Gln Asn Thr Val
65 70 75 80

Pro Leu Asp Gln Arg His Leu Asn Leu Gly Lys Ala Leu Arg Ala Ile
85 90 95

Gly Tyr Asp Pro Ala Leu Ile Gly Tyr Thr Thr Thr Thr Pro Asp Pro
100 105 110

Arg Thr Thr Ser Ala Arg Asp Pro Arg Phe Thr Val Leu Gly Asp Ile
115 120 125

Met Asp Gly Phe Arg Ser Val Gly Ala Phe Glu Pro Asn Met Glu Gly
130 135 140

Tyr Phe Gly Trp Val Ala Gln Asn Gly Phe Glu Leu Pro Glu Asn Arg
145 150 155 160

Glu Asp Ile Trp Leu Pro Glu Gly Glu His Ser Val Pro Gly Ala Thr
165 170 175

Asp Lys Pro Ser Arg Ile Pro Lys Glu Phe Ser Asp Ser Thr Phe Phe
180 185 190

Thr Glu Arg Ala Leu Thr Tyr Leu Lys Gly Arg Asp Gly Lys Pro Phe
195 200 205

Phe Leu His Leu Gly Tyr Tyr Arg Pro His Pro Pro Phe Val Ala Ser
210 215 220

Ala Pro Tyr His Ala Met Tyr Lys Ala Glu Asp Met Pro Ala Pro Ile
225 230 235 240

Arg Ala Glu Asn Pro Asp Ala Glu Ala Ala Gln His Pro Leu Met Lys
245 250 255

His Tyr Ile Asp His Ile Arg Arg Gly Ser Phe Phe His Gly Ala Glu
260 265 270

Gly Ser Gly Ala Thr Leu Asp Glu Gly Glu Ile Arg Gln Met Arg Ala
275 280 285

Thr Tyr Cys Gly Leu Ile Thr Glu Ile Asp Asp Cys Leu Gly Arg Val
290 295 300

Phe Ala Tyr Leu Asp Glu Thr Gly Gln Trp Asp Asp Thr Leu Ile Ile
305 310 315 320

Phe Thr Ser Asp His Gly Glu Gln Leu Gly Asp His His Leu Leu Gly
325 330 335

Lys Ile Gly Tyr Asn Ala Glu Ser Phe Arg Ile Pro Leu Val Ile Lys
340 345 350

Asp Ala Gly Gln Asn Arg His Ala Gly Gln Ile Glu Glu Gly Phe Ser
355 360 365

Glu Ser Ile Asp Val Met Pro Thr Ile Leu Glu Trp Leu Gly Gly Glu
370 375 380

Thr Pro Arg Ala Cys Asp Gly Arg Ser Leu Leu Pro Phe Leu Ala Glu
385 390 395 400

Gly Lys Pro Ser Asp Trp Arg Thr Glu Leu His Tyr Glu Phe Asp Phe
405 410 415

Arg Asp Val Phe Tyr Asp Gln Pro Gln Asn Ser Val Gln Leu Ser Gln
420 425 430

Asp Asp Cys Ser Leu Cys Val Ile Glu Asp Glu Asn Tyr Lys Tyr Val
435 440 445

His Phe Ala Ala Leu Pro Pro Leu Phe Phe Asp Leu Lys Ala Asp Pro
450 455 460

His Glu Phe Ser Asn Leu Ala Gly Asp Pro Ala Tyr Ala Ala Leu Val
465 470 475 480

Arg Asp Tyr Ala Gln Lys Ala Leu Ser Trp Arg Leu Ser His Ala Asp
485 490 495

Arg Thr Leu Thr His Tyr Arg Ser Ser Pro Gln Gly Leu Thr Thr Arg
500 505 510

Asn His



<210> 27 
<211> 2250 
<212> DNA 

<213> Agrobacterium rhizogenes

<220>
<221> CDS
<222> (1)..(2247)

<223> coding for tryptophan oxygenase (aux1)
<400> 27 

atg gct gga tcc tcc ttc aca ttg cca tca act ggc tca gcg ccc ctt 48
Met Ala Gly Ser Ser Phe Thr Leu Pro Ser Thr Gly Ser Ala Pro Leu
1 5 10 15

gat atg atg ctt atc gat gat tca gat ctg ctg caa ttg ggt ctc cag 96
Asp Met Met Leu Ile Asp Asp Ser Asp Leu Leu Gln Leu Gly Leu Gln
20 25 30

cag gta ttc tcg aag cgg tac aca gag aca ccg cag tca cgc tac aaa 144
Gln Val Phe Ser Lys Arg Tyr Thr Glu Thr Pro Gln Ser Arg Tyr Lys
35 40 45

ctg acc agg agg gct tct cca gac gtc tca tct ggc gaa ggc aat gtg 192
Leu Thr Arg Arg Ala Ser Pro Asp Val Ser Ser Gly Glu Gly Asn Val
50 55 60

cat gcc ctt gcg ttc ata tat gtc aac gct gag acg ttg cag atg atc 240
His Ala Leu Ala Phe Ile Tyr Val Asn Ala Glu Thr Leu Gln Met Ile
65 70 75 80

aaa aac gct cga tcg cta acc gaa gcg aac ggc gtc aaa gat ctt gtc 288
Lys Asn Ala Arg Ser Leu Thr Glu Ala Asn Gly Val Lys Asp Leu Val
85 90 95

gcc atc gac gtt ccg cca ttt cga aac gac ttc tca aga gcg cta ctc 336
Ala Ile Asp Val Pro Pro Phe Arg Asn Asp Phe Ser Arg Ala Leu Leu
100 105 110

ctt caa gtg atc aac ttg ttg gga aac aac cga aat gcc gat gac gat 384
Leu Gln Val Ile Asn Leu Leu Gly Asn Asn Arg Asn Ala Asp Asp Asp
115 120 125

ctt agt cac ttc ata gca gtt gct ctc cca aac agc gcc cgc tct aag 432
Leu Ser His Phe Ile Ala Val Ala Leu Pro Asn Ser Ala Arg Ser Lys
130 135 140

atc cta acc acg gca ccg ttc gaa gga agc ttg tca gaa aac ttc agg 480
Ile Leu Thr Thr Ala Pro Phe Glu Gly Ser Leu Ser Glu Asn Phe Arg
145 150 155 160

ggg ttc ccg atc act cgt gaa gga aat gtg gca tgt gaa gtg cta gcc 528
Gly Phe Pro Ile Thr Arg Glu Gly Asn Val Ala Cys Glu Val Leu Ala
165 170 175

tat ggg aat aac ttg atg ccc aag gcc tgc tcc gat tcc ttt cca acc 576
Tyr Gly Asn Asn Leu Met Pro Lys Ala Cys Ser Asp Ser Phe Pro Thr
180 185 190

gtg gat ctt ctt tat gac tat ggc aag ttc ttc gag agt tgc gcg gcc 624
Val Asp Leu Leu Tyr Asp Tyr Gly Lys Phe Phe Glu Ser Cys Ala Ala
195 200 205

gat gga cgt atc ggt tat ttt cct gaa ggc gtt acg aaa cct aaa gtg 672
Asp Gly Arg Ile Gly Tyr Phe Pro Glu Gly Val Thr Lys Pro Lys Val
210 215 220

gct ata att ggc gca ggc ttt tcc ggg ctc gtt gca gcg agc gaa cta 720
Ala Ile Ile Gly Ala Gly Phe Ser Gly Leu Val Ala Ala Ser Glu Leu
225 230 235 240

ctt cat gca ggg gta gac gat gtt acg gtg tat gag gcg agt gat cgg 768
Leu His Ala Gly Val Asp Asp Val Thr Val Tyr Glu Ala Ser Asp Arg
245 250 255

ctt gga gga aag cta tgg tca cac gga ttt aag agt gct cca aat gtg 816
Leu Gly Gly Lys Leu Trp Ser His Gly Phe Lys Ser Ala Pro Asn Val
260 265 270

ata gcc gag atg ggg gcc atg cgt ttt ccg cga agt gaa tca tgc ttg 864
Ile Ala Glu Met Gly Ala Met Arg Phe Pro Arg Ser Glu Ser Cys Leu
275 280 285

ttc ttc tat ctc aaa aag cac gga ctg gac tcc gtt ggt ctg ttc ccg 912
Phe Phe Tyr Leu Lys Lys His Gly Leu Asp Ser Val Gly Leu Phe Pro
290 295 300

aat ccg gga agt gtc gat acc gca ttg ttc tac agg ggc cgt caa tat 960
Asn Pro Gly Ser Val Asp Thr Ala Leu Phe Tyr Arg Gly Arg Gln Tyr
305 310 315 320

atc tgg aaa gcg gga gag gag cca ccg gag ctg ttt cgt cgt gtg cac 1008
Ile Trp Lys Ala Gly Glu Glu Pro Pro Glu Leu Phe Arg Arg Val His
325 330 335

cat gga tgg cgc gca ttt ttg caa gat ggc tat ctc cat gat gga gtc 1056
His Gly Trp Arg Ala Phe Leu Gln Asp Gly Tyr Leu His Asp Gly Val
340 345 350

atg ttg gcg tca ccg tta gca att gtt gac gcc ttg aat tta ggg cat 1104
Met Leu Ala Ser Pro Leu Ala Ile Val Asp Ala Leu Asn Leu Gly His
355 360 365

cta cag cag gcg cat ggc ttc tgg caa tct tgg ctc aca tat ttt gag 1152
Leu Gln Gln Ala His Gly Phe Trp Gln Ser Trp Leu Thr Tyr Phe Glu
370 375 380

cga gag tct ttc tct tct ggc atc gaa aaa atg ttc ttg ggc aat cat 1200
Arg Glu Ser Phe Ser Ser Gly Ile Glu Lys Met Phe Leu Gly Asn His
385 390 395 400

cct ccg ggg ggt gaa caa tgg aat tcc cta gat gac ttg gat ctt ttc 1248
Pro Pro Gly Gly Glu Gln Trp Asn Ser Leu Asp Asp Leu Asp Leu Phe
405 410 415

aaa gcg ctg ggt att gga tcc ggc gga ttc ggc cct gta ttt gaa agt 1296
Lys Ala Leu Gly Ile Gly Ser Gly Gly Phe Gly Pro Val Phe Glu Ser
420 425 430

ggg ttt atc gag atc ctt cgc tta gtc gtc aac ggg tat gag gat aac 1344
Gly Phe Ile Glu Ile Leu Arg Leu Val Val Asn Gly Tyr Glu Asp Asn
435 440 445

gtg cgg ctg agt tac gaa gga att tct gag ctg cct cat agg atc gcc 1392
Val Arg Leu Ser Tyr Glu Gly Ile Ser Glu Leu Pro His Arg Ile Ala
450 455 460

tca cag gta att aac ggc aga tct att cgc gag cgt aca att cac gtt 1440
Ser Gln Val Ile Asn Gly Arg Ser Ile Arg Glu Arg Thr Ile His Val
465 470 475 480

caa gtc gag cag att gat aga gag gag gat aaa ata aat atc aag atc 1488
Gln Val Glu Gln Ile Asp Arg Glu Glu Asp Lys Ile Asn Ile Lys Ile
485 490 495

aaa gga gga aag gtt gag gtc tat gat cga gta ctg gtt aca tcc ggg 1536
Lys Gly Gly Lys Val Glu Val Tyr Asp Arg Val Leu Val Thr Ser Gly
500 505 510

ttt gcg aac atc gaa atg cgc cat ctc ctg aca tca agc aac gca ttc 1584
Phe Ala Asn Ile Glu Met Arg His Leu Leu Thr Ser Ser Asn Ala Phe
515 520 525

ttc cat gca gat gta agc cat gca ata ggg aac agt cat atg act ggt 1632
Phe His Ala Asp Val Ser His Ala Ile Gly Asn Ser His Met Thr Gly
530 535 540

gcg tca aaa ctg ttc ttg ctg act aac gaa aaa ttc tgg cta caa cat 1680
Ala Ser Lys Leu Phe Leu Leu Thr Asn Glu Lys Phe Trp Leu Gln His
545 550 555 560

cat ttg cca tcg tgc ata ctc acc acc ggc gtt gca aag gca gtt tat 1728
His Leu Pro Ser Cys Ile Leu Thr Thr Gly Val Ala Lys Ala Val Tyr
565 570 575

tgc tta gac tat gat ccg cga gat cca agc ggc aaa gga ctg gtg ttg 1776
Cys Leu Asp Tyr Asp Pro Arg Asp Pro Ser Gly Lys Gly Leu Val Leu
580 585 590

ata agc tat act tgg gag gat gac tca cat aag ctc cta gcc gtc ccc 1824
Ile Ser Tyr Thr Trp Glu Asp Asp Ser His Lys Leu Leu Ala Val Pro
595 600 605

gac aaa aga gaa agg ttc gca tcg ctg cag cgc gat att ggg agg gca 1872
Asp Lys Arg Glu Arg Phe Ala Ser Leu Gln Arg Asp Ile Gly Arg Ala
610 615 620

ttc cca gat ttt gcc aag cac cta act cct gca gac ggg aac tat gat 1920
Phe Pro Asp Phe Ala Lys His Leu Thr Pro Ala Asp Gly Asn Tyr Asp
625 630 635 640

gat aat atc gtt caa cat gat tgg ctg act gat ccc cac gct ggc gga 1968
Asp Asn Ile Val Gln His Asp Trp Leu Thr Asp Pro His Ala Gly Gly
645 650 655

gcg ttt aaa ctg aac cgc aga ggc aac gac gta tat tca gaa agg ctt 2016
Ala Phe Lys Leu Asn Arg Arg Gly Asn Asp Val Tyr Ser Glu Arg Leu
660 665 670

ttc ttt cag ccc ttt ... gta atg cat ccc gcg gac gat aag gga ctt 2064
Phe Phe Gln Pro Phe Asp Val Met His Pro Ala Asp Asp Lys Gly Leu
675 680 685

tac ttg gcc ggt tgt ... tgt tcc ttc acc gga ggg tgg gtt cat ggt 2112
Tyr Leu Ala Gly Cys Ser Cys Ser Phe Thr Gly Gly Trp Val His Gly
690 695 700

gcc att cag acc gca tgc aac gct acg tgt gcg atc att tat ggt tcc 2160
Ala Ile Gln Thr Ala Cys Asn Ala Thr Cys Ala Ile Ile Tyr Gly Ser
705 710 715 720

gga cac ctg caa gag cta atc cac tgg cga cac ctc aaa gaa ggt aat 2208
Gly His Leu Gln Glu Leu Ile His Trp Arg His Leu Lys Glu Gly Asn
725 730 735

cca ctg gcg cac gct tgg aag cgg tat agg tat caa gcg tga 2250
Pro Leu Ala His Ala Trp Lys Arg Tyr Arg Tyr Gln Ala
740 745

<210> 28
<211> 749
<212> PRT

<213> Agrobacterium rhizogenes

<400> 28

Met Ala Gly Ser Ser Phe Thr Leu Pro Ser Thr Gly Ser Ala Pro Leu
1 5 10 15

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Asp Met Met Leu lie Asp Asp Ser Asp Leu Leu Gin Leu Gly Leu Gin 

20 25 30 

Gin Val Phe Ser Lys Arg Tyr Thr Glu Thr Pro Gin Ser Arg Tyr Lys 
35 40 45 

Leu Thr Arg Arg Ala Ser Pro Asp Val Ser Ser Gly Glu Gly Asn Val 

50 55 60 

His Ala Leu Ala Phe lie Tyr Val Asn Ala Glu Thr Leu Gin Met lie 
65 70 75 BO 

Lys Asn Ala Arg Ser Leu Thr Glu Ala Asn Gly Val Lys Asp Leu Val 

85 90 95 

Ala lie Asp Val Pro Pro Phe Arg Asn Asp Phe Ser Arg Ala Leu Leu 

100 105 HO 

Leu Gin Val lie Asn Leu Leu Gly Asn Asn Arg Asn Ala Asp Asp Asp 
115 120 125 

Leu Ser His Phe lie Ala Val Ala Leu Pro Asn Ser Ala Arg Ser Lys 

130 135 140 

He Leu Thr Thr Ala Pro Phe Glu Gly Ser Leu Ser Glu Asn Phe Arg 
145 150 155 160 

Glv Phe Pro He Thr Arg Glu Gly Asn Val Ala Cys Glu Val Leu Ala 

165 170 175 

Tyr Gly Asn Asn Leu Met Pro Lys Ala Cys Ser Asp Ser Phe Pro Thr 

180 185 190 

Val Asp Leu Leu Tyr Asp Tyr Gly Lys Phe Phe Glu Ser Cys Ala Ala 
ig5 200 205 

Asp Gly Arg He Gly Tyr Phe Pro Glu Gly Val Thr Lys Pro Lys Val 

210 215 220 

Ala He He Gly Ala Gly Phe Ser Gly Leu Val Ala Ala Ser Glu Leu 
225 230 235 240 

Leu His Ala Gly Val Asp Asp Val Thr Val Tyr Glu Ala Ser Asp Arg 

245 250 255 

Leu Gly Gly Lys Leu Trp Ser His Gly Phe Lys Ser Ala Pro Asn Val 

260 265 270 

He Ala Glu Met Gly Ala Met Arg Phe Pro Arg Ser Glu Ser Cys Leu 

275 280 285 

Phe Phe Tyr Leu Lys Lys His Gly Leu Asp Ser Val Gly Leu Phe Pro 

290 295 300 

Asn Pro Gly Ser Val Asp Thr Ala Leu Phe Tyr Arg Gly Arg Gin Tyr 
305 310 315 320 

He Trp Lys Ala Gly Glu Glu Pro Pro Glu Leu Phe Arg Arg Val His 

325 330 335 

His Gly Trp Arg Ala Phe Leu Gin Asp Gly Tyr Leu His Asp Gly Val 

340 345 350 

Met Leu Ala Ser Pro Leu Ala He Val Asp Ala Leu Asn Leu Gly His 
355 360 365 

Leu Gin Gin Ala His Gly Phe Trp Gin Ser Trp Leu Thr Tyr Phe Glu 
370 375 380 
Arg Glu Ser Phe Ser Ser Gly Ile Glu Lys Met Phe Leu Gly Asn His
385 390 395 400

Pro Pro Gly Gly Glu Gln Trp Asn Ser Leu Asp Asp Leu Asp Leu Phe
405 410 415

Lys Ala Leu Gly Ile Gly Ser Gly Gly Phe Gly Pro Val Phe Glu Ser
420 425 430

Gly Phe Ile Glu Ile Leu Arg Leu Val Val Asn Gly Tyr Glu Asp Asn
435 440 445

Val Arg Leu Ser Tyr Glu Gly Ile Ser Glu Leu Pro His Arg Ile Ala
450 455 460

Ser Gln Val Ile Asn Gly Arg Ser Ile Arg Glu Arg Thr Ile His Val
465 470 475 480

Gln Val Glu Gln Ile Asp Arg Glu Glu Asp Lys Ile Asn Ile Lys Ile
485 490 495

Lys Gly Gly Lys Val Glu Val Tyr Asp Arg Val Leu Val Thr Ser Gly
500 505 510

Phe Ala Asn Ile Glu Met Arg His Leu Leu Thr Ser Ser Asn Ala Phe
515 520 525

Phe His Ala Asp Val Ser His Ala Ile Gly Asn Ser His Met Thr Gly
530 535 540

Ala Ser Lys Leu Phe Leu Leu Thr Asn Glu Lys Phe Trp Leu Gln His
545 550 555 560

His Leu Pro Ser Cys Ile Leu Thr Thr Gly Val Ala Lys Ala Val Tyr
565 570 575

Cys Leu Asp Tyr Asp Pro Arg Asp Pro Ser Gly Lys Gly Leu Val Leu
580 585 590

Ile Ser Tyr Thr Trp Glu Asp Asp Ser His Lys Leu Leu Ala Val Pro
595 600 605

Asp Lys Arg Glu Arg Phe Ala Ser Leu Gln Arg Asp Ile Gly Arg Ala
610 615 620

Phe Pro Asp Phe Ala Lys His Leu Thr Pro Ala Asp Gly Asn Tyr Asp
625 630 635 640

Asp Asn Ile Val Gln His Asp Trp Leu Thr Asp Pro His Ala Gly Gly
645 650 655

Ala Phe Lys Leu Asn Arg Arg Gly Asn Asp Val Tyr Ser Glu Arg Leu
660 665 670

Phe Phe Gln Pro Phe Asp Val Met His Pro Ala Asp Asp Lys Gly Leu
675 680 685

Tyr Leu Ala Gly Cys Ser Cys Ser Phe Thr Gly Gly Trp Val His Gly
690 695 700

Ala Ile Gln Thr Ala Cys Asn Ala Thr Cys Ala Ile Ile Tyr Gly Ser
705 710 715 720

Gly His Leu Gln Glu Leu Ile His Trp Arg His Leu Lys Glu Gly Asn
725 730 735

Pro Leu Ala His Ala Trp Lys Arg Tyr Arg Tyr Gln Ala
740 745

<210> 29
<211> 1401 
<212> DNA 

<213> Agrobacterium rhizogenes 

<220> 

<221> CDS 

<222> (1)..(1398) 

<223> coding for indoleacetamide hydrolase 
<400> 29 

atg gtg acc ctc tcc tcg atc acc gag acg ctt aaa tgt ctc agg gaa 48
Met Val Thr Leu Ser Ser Ile Thr Glu Thr Leu Lys Cys Leu Arg Glu
1 5 10 15

aga aaa tac tcg tgc ttt gag tta atc gaa acg ata ata gcc cgc tgt 96
Arg Lys Tyr Ser Cys Phe Glu Leu Ile Glu Thr Ile Ile Ala Arg Cys
20 25 30

gaa gca gca aga tcc tta aac gcc ttt ctg gaa acc gac tgg gcg cac 144
Glu Ala Ala Arg Ser Leu Asn Ala Phe Leu Glu Thr Asp Trp Ala His
35 40 45

cta cgg tgg act gcc agc aaa atc gat caa cac gga ggt gcc ggt gtt 192
Leu Arg Trp Thr Ala Ser Lys Ile Asp Gln His Gly Gly Ala Gly Val
50 55 60

ggc cta gct ggc gtt ccc cta tgc ttt aaa gcg aat att gcg aca ggc 240
Gly Leu Ala Gly Val Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

agg ttc gcc gcg acc gct ggt acg cca ggc tta cag aac cac aaa ccc 288
Arg Phe Ala Ala Thr Ala Gly Thr Pro Gly Leu Gln Asn His Lys Pro
85 90 95

aag acg cct gcc gga gtt gca cga caa ctt ctc gcg gct ggg gca ctg 336
Lys Thr Pro Ala Gly Val Ala Arg Gln Leu Leu Ala Ala Gly Ala Leu
100 105 110

cct ggc gct tcg gga aac atg cac gaa ttg tct ttt ggg atc acg agc 384
Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

aac aac ttc gcc aca ggc gcc gta cga aac ccg tgg aac cct agt ctc 432
Asn Asn Phe Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

atc cca ggg gga tca agt ggg ggt gtg gcc gcc gcg gtg gcc ggc cga 480
Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Gly Arg
145 150 155 160

ttg atg ctg ggc ggc gtc gga act gac acg gga gcg tcg gtc cgt tta 528
Leu Met Leu Gly Gly Val Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

ccg gcc gcc ttg tgc ggc gtg gtg ggg ttt cgt cct acc gtg ggg cga 576
Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Val Gly Arg
180 185 190

tat cca acg gac gga ata gtt ccg gta agc ccc acc cgg gac acc cct 624
Tyr Pro Thr Asp Gly Ile Val Pro Val Ser Pro Thr Arg Asp Thr Pro
195 200 205

ggc gtt atc gca cag aat gtt ccg gac gtg att ctt ctt gac ggt atc 672
Gly Val Ile Ala Gln Asn Val Pro Asp Val Ile Leu Leu Asp Gly Ile
210 215 220

att tgc ggg aga ccg ccg gtt aat caa acg gtc cgc ctg aag ggg ctg 720
Ile Cys Gly Arg Pro Pro Val Asn Gln Thr Val Arg Leu Lys Gly Leu
225 230 235 240

cgt ata ggc ttg cca acc gct tac ttt tac aac gac ctg gag ccc gat 768
Arg Ile Gly Leu Pro Thr Ala Tyr Phe Tyr Asn Asp Leu Glu Pro Asp
245 250 255

gtc gcc tta gca gcc gag acg att atc aga gtt ctg gca cgc aaa gat 816
Val Ala Leu Ala Ala Glu Thr Ile Ile Arg Val Leu Ala Arg Lys Asp
260 265 270

gtt act ttt gtt gaa gca gat att cct gat tta gcg cat cac aat gaa 864
Val Thr Phe Val Glu Ala Asp Ile Pro Asp Leu Ala His His Asn Glu
275 280 285

ggg gtc agc ttt ccg act gcc atc tac gaa ttt ccg ttg tcc ctt gaa 912
Gly Val Ser Phe Pro Thr Ala Ile Tyr Glu Phe Pro Leu Ser Leu Glu
290 295 300

cat tat att cag aac ttc gta gag ggt gtt tcc ttt tct gag gtt gtc 960
His Tyr Ile Gln Asn Phe Val Glu Gly Val Ser Phe Ser Glu Val Val
305 310 315 320

aga gcg att cgc agt ccg gat gtt gca agt att ctc aat gca caa ctc 1008
Arg Ala Ile Arg Ser Pro Asp Val Ala Ser Ile Leu Asn Ala Gln Leu
325 330 335

tcg gat aat ctt att tcc aaa agc gag tat tgt ctg gcg cga cgt ttt 1056
Ser Asp Asn Leu Ile Ser Lys Ser Glu Tyr Cys Leu Ala Arg Arg Phe
340 345 350

ttc aga ccg aga ctc caa gcg gcc tac cac agt tac ttc aag gcg cat 1104
Phe Arg Pro Arg Leu Gln Ala Ala Tyr His Ser Tyr Phe Lys Ala His
355 360 365

cag cta gat gca att ctt ttc cca aca gct ccg ttg aca gcc aag cca 1152
Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Thr Ala Lys Pro
370 375 380

att ggc cat gat cta tcg gtg att cac aat ggc tca atg acc gat acc 1200
Ile Gly His Asp Leu Ser Val Ile His Asn Gly Ser Met Thr Asp Thr
385 390 395 400

ttt aaa atc ttc gtg cgg aat gta gat ccc agc agt aat gcg ggc ctg 1248
Phe Lys Ile Phe Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

ccg ggc cta agt ctt ccc gtt tct ctt agt tcc aac ggt ctg cct att 1296
Pro Gly Leu Ser Leu Pro Val Ser Leu Ser Ser Asn Gly Leu Pro Ile
420 425 430

ggc atg gaa atc gat ggc tct gca agc tcg gat gaa cgt ctg tta gca 1344
Gly Met Glu Ile Asp Gly Ser Ala Ser Ser Asp Glu Arg Leu Leu Ala
435 440 445

att gga cta gcg ata gaa gaa gca ata gac ttt agg cat cgt ccg act 1392
Ile Gly Leu Ala Ile Glu Glu Ala Ile Asp Phe Arg His Arg Pro Thr
450 455 460

ctg tcg taa 1401
Leu Ser
465

<210> 30
<211> 466
<212> PRT
<213> Agrobacterium rhizogenes
<400> 30

Met Val Thr Leu Ser Ser Ile Thr Glu Thr Leu Lys Cys Leu Arg Glu
1 5 10 15

Arg Lys Tyr Ser Cys Phe Glu Leu Ile Glu Thr Ile Ile Ala Arg Cys
20 25 30

Glu Ala Ala Arg Ser Leu Asn Ala Phe Leu Glu Thr Asp Trp Ala His
35 40 45

Leu Arg Trp Thr Ala Ser Lys Ile Asp Gln His Gly Gly Ala Gly Val
50 55 60

Gly Leu Ala Gly Val Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

Arg Phe Ala Ala Thr Ala Gly Thr Pro Gly Leu Gln Asn His Lys Pro
85 90 95

Lys Thr Pro Ala Gly Val Ala Arg Gln Leu Leu Ala Ala Gly Ala Leu
100 105 110

Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

Asn Asn Phe Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Gly Arg
145 150 155 160

Leu Met Leu Gly Gly Val Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Val Gly Arg
180 185 190

Tyr Pro Thr Asp Gly Ile Val Pro Val Ser Pro Thr Arg Asp Thr Pro
195 200 205

Gly Val Ile Ala Gln Asn Val Pro Asp Val Ile Leu Leu Asp Gly Ile
210 215 220

Ile Cys Gly Arg Pro Pro Val Asn Gln Thr Val Arg Leu Lys Gly Leu
225 230 235 240

Arg Ile Gly Leu Pro Thr Ala Tyr Phe Tyr Asn Asp Leu Glu Pro Asp
245 250 255

Val Ala Leu Ala Ala Glu Thr Ile Ile Arg Val Leu Ala Arg Lys Asp
260 265 270

Val Thr Phe Val Glu Ala Asp Ile Pro Asp Leu Ala His His Asn Glu
275 280 285

Gly Val Ser Phe Pro Thr Ala Ile Tyr Glu Phe Pro Leu Ser Leu Glu
290 295 300

His Tyr Ile Gln Asn Phe Val Glu Gly Val Ser Phe Ser Glu Val Val
305 310 315 320

Arg Ala Ile Arg Ser Pro Asp Val Ala Ser Ile Leu Asn Ala Gln Leu
325 330 335

Ser Asp Asn Leu Ile Ser Lys Ser Glu Tyr Cys Leu Ala Arg Arg Phe
340 345 350

Phe Arg Pro Arg Leu Gln Ala Ala Tyr His Ser Tyr Phe Lys Ala His
355 360 365

Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Thr Ala Lys Pro
370 375 380

Ile Gly His Asp Leu Ser Val Ile His Asn Gly Ser Met Thr Asp Thr
385 390 395 400

Phe Lys Ile Phe Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

Pro Gly Leu Ser Leu Pro Val Ser Leu Ser Ser Asn Gly Leu Pro Ile
420 425 430

Gly Met Glu Ile Asp Gly Ser Ala Ser Ser Asp Glu Arg Leu Leu Ala
435 440 445

Ile Gly Leu Ala Ile Glu Glu Ala Ile Asp Phe Arg His Arg Pro Thr
450 455 460

Leu Ser
465

<210> 31 
<211> 2268 
<212> DNA 

<213> Agrobacterium tumefaciens 

<220> 

<221> CDS 

<222> (1)..(2265) 

<223> coding for tryptophan monooxygenase 
<400> 31 

atg tca gct tca cct ctc ctt gat aac cag tgc gat cat ttc tct acc 48
Met Ser Ala Ser Pro Leu Leu Asp Asn Gln Cys Asp His Phe Ser Thr
1 5 10 15

aaa atg gtg gat ctg ata atg gtc gat aag gct gat gaa ttg gac cgc 96
Lys Met Val Asp Leu Ile Met Val Asp Lys Ala Asp Glu Leu Asp Arg
20 25 30

agg gtt tcc gat gcc ttc tca gaa cgt gaa gct tct agg gga agg agg 144
Arg Val Ser Asp Ala Phe Ser Glu Arg Glu Ala Ser Arg Gly Arg Arg
35 40 45

att act caa atc tcc ggc gag tgc agc gct ggg tta gct tgc aaa agg 192
Ile Thr Gln Ile Ser Gly Glu Cys Ser Ala Gly Leu Ala Cys Lys Arg
50 55 60

ctg gcc gac ggt cgc ttt ccc gag atc tca act ggt gag aag gta gca 240
Leu Ala Asp Gly Arg Phe Pro Glu Ile Ser Thr Gly Glu Lys Val Ala
65 70 75 80

gcc ctc tcc gct tac atc tat gtt ggc aag gaa att ctg ggg cgg ata 288
Ala Leu Ser Ala Tyr Ile Tyr Val Gly Lys Glu Ile Leu Gly Arg Ile
85 90 95

ctt gaa tcg gaa cct tgg gcg cga gca aga gtg agt ggt ctc gtt gcc 336
Leu Glu Ser Glu Pro Trp Ala Arg Ala Arg Val Ser Gly Leu Val Ala
100 105 110
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4~n- -hrr+* ato aat ttc tec gaa gca caa ctt etc 
S K S S P"e S £ K «- « Su £. «. — «• 
115 120 

^ ttt tta eta age ggt aaa aga tgt gca tec age gat ctt 
III Th C r 2 Phe 2 2 i Sr Xflr- -9 Cys Ala Ser Ser Asp Leu 



384 



432 



130 135 

a a a si a a: a a a a a s a = = = 

150 

ct, caa a tg cc« cc 9 tac gag aaa ggc acg acg aaa ego jj* ace ggg 

Leu Gin Met Pro Pro Tyr Glu Lys Gly Thr Tnr uys « ^ 

165 170 

2 a a a s a a a a a a s s a s - 

180 185 

IS S a 2 S = 5 K = K « s i ™ S ~ 

s a a a a - s a - a K = - = a s 



528 



576 



624 



672 



s a a a a a a a a a a s a a a s 
a a s a a a a s a a a a a a i = 

245 250 

c« ,« # n r s J- — *» « s a a a ss 

His Ala Gly Val Asp Asp Val Thr lie ryr 

„. ,,o o« jjj « r jj* jet «. g S - = Kg ?S 
Gly Gly Lys Leu Trp Ser His Axa j/ne y ^ ^ 



275 280 



k i" a ss s a s a a a a a a a a a 



768 



816 



864 



912 



a a a a a a a a a a a a a a a a ••• 



310 315 



305 J 

4. „^ act aac ttg ate tac caa ggc etc cga tac gtg 

s a a a a £ a a a s «. « » «, «. « 

325 330 



1008 



age 1056 



cca cca aag ctg ttc cat cgc gtt tac age 
tgg aaa gee ggg eag cag cca eeg aag g ^ ^ g ^ ^ ^ 

Trp Lys Ala Gly Gin Gin Pro *ro ^y^ ±> 

a a a s a a a a a a a a a a a a 

355 

4-4- act caa gec ttg aaa tea gga gac att 

a a a a a a a a a a a s . - « -» - 



370 " 5 
agg cgg gct cat gac tcc tgg caa act tgg ctg aac cgt ttc ggg agg 1200
Arg Arg Ala His Asp Ser Trp Gln Thr Trp Leu Asn Arg Phe Gly Arg
385 390 395 400

gag tcc ttc tct tca gcg ata gag agg atc ttt ctg ggc acg cat cct 1248
Glu Ser Phe Ser Ser Ala Ile Glu Arg Ile Phe Leu Gly Thr His Pro
405 410 415

cct ggt ggt gaa aca tgg agt ttc cct cat gat tgg gac cta ttc aag 1296
Pro Gly Gly Glu Thr Trp Ser Phe Pro His Asp Trp Asp Leu Phe Lys
420 425 430

cta atg gga ata gga tct ggc ggg ttt ggt cca gtt ttt gaa agc ggg 1344
Leu Met Gly Ile Gly Ser Gly Gly Phe Gly Pro Val Phe Glu Ser Gly
435 440 445

ttt att gag atc ctt cgc ttg gtc ata aac gga tat gaa gaa aat cag 1392
Phe Ile Glu Ile Leu Arg Leu Val Ile Asn Gly Tyr Glu Glu Asn Gln
450 455 460

cgg atg tgc tct gaa gga atc tca gaa ctt cca cgt cga ata gcc tct 1440
Arg Met Cys Ser Glu Gly Ile Ser Glu Leu Pro Arg Arg Ile Ala Ser
465 470 475 480

caa gtg gtt aac ggt gtg tct gta agc cag cgt ata cgc cat gtt caa 1488
Gln Val Val Asn Gly Val Ser Val Ser Gln Arg Ile Arg His Val Gln
485 490 495

gtc agg gcg att gag aag gaa aag aca aaa ata aag ata agg ctt aag 1536
Val Arg Ala Ile Glu Lys Glu Lys Thr Lys Ile Lys Ile Arg Leu Lys
500 505 510

agc ggg ata tct gaa ctt tat gat aag gtg gtg gtt aca tct gga ctc 1584
Ser Gly Ile Ser Glu Leu Tyr Asp Lys Val Val Val Thr Ser Gly Leu
515 520 525

gca aat atc caa ctc agg cat tgt ctg aca tgc gat acc acc att ttt 1632
Ala Asn Ile Gln Leu Arg His Cys Leu Thr Cys Asp Thr Thr Ile Phe
530 535 540

cgt gca cca gtg aac caa gcg gtt gat aac agc cat atg aca ggc tcg 1680
Arg Ala Pro Val Asn Gln Ala Val Asp Asn Ser His Met Thr Gly Ser
545 550 555 560

tca aaa ctc ttt ctg ctg act gaa cga aaa ttt tgg tta gac cat atc 1728
Ser Lys Leu Phe Leu Leu Thr Glu Arg Lys Phe Trp Leu Asp His Ile
565 570 575

ctc ccg tcc tgt gtc ctc atg gac ggg atc gca aaa gca gtg tac tgc 1776
Leu Pro Ser Cys Val Leu Met Asp Gly Ile Ala Lys Ala Val Tyr Cys
580 585 590

ttg gac tat gag ccg cag gat ccg aat ggt aaa ggt ctg gtg ccc ccc 1824
Leu Asp Tyr Glu Pro Gln Asp Pro Asn Gly Lys Gly Leu Val Pro Pro
595 600 605

act tat aca tgg gag gac gac tcc cac aag ctg ttg gcg gtt ccc gac 1872
Thr Tyr Thr Trp Glu Asp Asp Ser His Lys Leu Leu Ala Val Pro Asp
610 615 620

aaa aaa gag cga ttc tgt ctg ctg cgg gac gca att tcg aga tct ttc 1920
Lys Lys Glu Arg Phe Cys Leu Leu Arg Asp Ala Ile Ser Arg Ser Phe
625 630 635 640

ccg gcg ttt gcc cag cat cta gtt cct gcc tgc gct gat tac gac caa 1968
Pro Ala Phe Ala Gln His Leu Val Pro Ala Cys Ala Asp Tyr Asp Gln
645 650 655

aat gtt gtt caa cat gat tgg ctt aca gac gag aat gcc ggg gga gct 2016
Asn Val Val Gln His Asp Trp Leu Thr Asp Glu Asn Ala Gly Gly Ala
660 665 670

ttc aaa ctc aac cgg cgt ggc gag gat ttt tat tct gaa gaa ctt ttc 2064
Phe Lys Leu Asn Arg Arg Gly Glu Asp Phe Tyr Ser Glu Glu Leu Phe
675 680 685

ttt caa gcg ctg gac atg cct aat gat acc gga gtt tac ttg gcg ggt 2112
Phe Gln Ala Leu Asp Met Pro Asn Asp Thr Gly Val Tyr Leu Ala Gly
690 695 700

tgc agt tgt tcc ttc acc ggt gga tgg gtg gag ggc gct att cag acc 2160
Cys Ser Cys Ser Phe Thr Gly Gly Trp Val Glu Gly Ala Ile Gln Thr
705 710 715 720

gcg tgt aac gcc gtc tgt gca att atc cac aat tgt gga ggt att ttg 2208
Ala Cys Asn Ala Val Cys Ala Ile Ile His Asn Cys Gly Gly Ile Leu
725 730 735

gca aag gac aat cct ctc gaa cac tct tgg aag aga tat aac tac cgc 2256
Ala Lys Asp Asn Pro Leu Glu His Ser Trp Lys Arg Tyr Asn Tyr Arg
740 745 750

aat aga aat taa 2268
Asn Arg Asn
755

<210> 32
<211> 755
<212> PRT
<213> Agrobacterium tumefaciens
<400> 32

Met Ser Ala Ser Pro Leu Leu Asp Asn Gln Cys Asp His Phe Ser Thr
1 5 10 15

Lys Met Val Asp Leu Ile Met Val Asp Lys Ala Asp Glu Leu Asp Arg
20 25 30

Arg Val Ser Asp Ala Phe Ser Glu Arg Glu Ala Ser Arg Gly Arg Arg
35 40 45

Ile Thr Gln Ile Ser Gly Glu Cys Ser Ala Gly Leu Ala Cys Lys Arg
50 55 60

Leu Ala Asp Gly Arg Phe Pro Glu Ile Ser Thr Gly Glu Lys Val Ala
65 70 75 80

Ala Leu Ser Ala Tyr Ile Tyr Val Gly Lys Glu Ile Leu Gly Arg Ile
85 90 95

Leu Glu Ser Glu Pro Trp Ala Arg Ala Arg Val Ser Gly Leu Val Ala
100 105 110

Ile Asp Leu Ala Pro Phe Cys Met Asp Phe Ser Glu Ala Gln Leu Leu
115 120 125

Gln Thr Leu Phe Leu Leu Ser Gly Lys Arg Cys Ala Ser Ser Asp Leu
130 135 140

Ser His Phe Val Ala Ile Ser Ile Ser Lys Thr Ala Arg Ser Arg Thr
145 150 155 160

Leu Gln Met Pro Pro Tyr Glu Lys Gly Thr Thr Lys Arg Val Thr Gly
165 170 175

Phe Thr Leu Thr Leu Glu Glu Ala Val Pro Phe Asp Met Val Ala Tyr
180 185 190

Gly Arg Asn Leu Met Leu Lys Ala Ser Ala Gly Ser Phe Pro Thr Ile
195 200 205

Asp Leu Leu Tyr Asp Tyr Arg Ser Phe Phe Asp Gln Cys Ser Asp Ile
210 215 220

Gly Arg Ile Gly Phe Phe Pro Glu Asp Val Pro Lys Pro Lys Val Ala
225 230 235 240

Ile Ile Gly Ala Gly Ile Ser Gly Leu Val Val Ala Ser Glu Leu Leu
245 250 255

His Ala Gly Val Asp Asp Val Thr Ile Tyr Glu Ala Ser Asp Arg Val
260 265 270

Gly Gly Lys Leu Trp Ser His Ala Phe Lys Asp Ala Pro Ser Val Val
275 280 285

Ala Glu Met Gly Ala Met Arg Phe Pro Pro Ala Ala Ser Cys Leu Phe
290 295 300

Phe Phe Leu Glu Arg Tyr Gly Leu Ser Ser Met Arg Pro Phe Pro Asn
305 310 315 320

Pro Gly Thr Val Asp Thr Asn Leu Val Tyr Gln Gly Leu Arg Tyr Val
325 330 335

Trp Lys Ala Gly Gln Gln Pro Pro Lys Leu Phe His Arg Val Tyr Ser
340 345 350

Gly Trp Arg Ala Phe Leu Arg Asp Gly Phe His Glu Gly Asp Ile Val
355 360 365

Leu Ala Ser Pro Val Val Ile Thr Gln Ala Leu Lys Ser Gly Asp Ile
370 375 380

Arg Arg Ala His Asp Ser Trp Gln Thr Trp Leu Asn Arg Phe Gly Arg
385 390 395 400

Glu Ser Phe Ser Ser Ala Ile Glu Arg Ile Phe Leu Gly Thr His Pro
405 410 415

Pro Gly Gly Glu Thr Trp Ser Phe Pro His Asp Trp Asp Leu Phe Lys
420 425 430

Leu Met Gly Ile Gly Ser Gly Gly Phe Gly Pro Val Phe Glu Ser Gly
435 440 445

Phe Ile Glu Ile Leu Arg Leu Val Ile Asn Gly Tyr Glu Glu Asn Gln
450 455 460

Arg Met Cys Ser Glu Gly Ile Ser Glu Leu Pro Arg Arg Ile Ala Ser
465 470 475 480

Gln Val Val Asn Gly Val Ser Val Ser Gln Arg Ile Arg His Val Gln
485 490 495

Val Arg Ala Ile Glu Lys Glu Lys Thr Lys Ile Lys Ile Arg Leu Lys
500 505 510

Ser Gly Ile Ser Glu Leu Tyr Asp Lys Val Val Val Thr Ser Gly Leu
515 520 525

Ala Asn Ile Gln Leu Arg His Cys Leu Thr Cys Asp Thr Thr Ile Phe
530 535 540

Arg Ala Pro Val Asn Gln Ala Val Asp Asn Ser His Met Thr Gly Ser
545 550 555 560

Ser Lys Leu Phe Leu Leu Thr Glu Arg Lys Phe Trp Leu Asp His Ile
565 570 575

Leu Pro Ser Cys Val Leu Met Asp Gly Ile Ala Lys Ala Val Tyr Cys
580 585 590

Leu Asp Tyr Glu Pro Gln Asp Pro Asn Gly Lys Gly Leu Val Pro Pro
595 600 605

Thr Tyr Thr Trp Glu Asp Asp Ser His Lys Leu Leu Ala Val Pro Asp
610 615 620

Lys Lys Glu Arg Phe Cys Leu Leu Arg Asp Ala Ile Ser Arg Ser Phe
625 630 635 640

Pro Ala Phe Ala Gln His Leu Val Pro Ala Cys Ala Asp Tyr Asp Gln
645 650 655

Asn Val Val Gln His Asp Trp Leu Thr Asp Glu Asn Ala Gly Gly Ala
660 665 670

Phe Lys Leu Asn Arg Arg Gly Glu Asp Phe Tyr Ser Glu Glu Leu Phe
675 680 685

Phe Gln Ala Leu Asp Met Pro Asn Asp Thr Gly Val Tyr Leu Ala Gly
690 695 700

Cys Ser Cys Ser Phe Thr Gly Gly Trp Val Glu Gly Ala Ile Gln Thr
705 710 715 720

Ala Cys Asn Ala Val Cys Ala Ile Ile His Asn Cys Gly Gly Ile Leu
725 730 735

Ala Lys Asp Asn Pro Leu Glu His Ser Trp Lys Arg Tyr Asn Tyr Arg
740 745 750

Asn Arg Asn
755

<210> 33 
<211> 1404 
<212> DNA 

<213> Agrobacterium tumefaciens 

<220> 

<221> CDS 

<222> (1)..(1401)

<223> coding for indoleacetamide hydrolase

^a°atg 3 ccc att acc teg tta gca caa acc eta gaa cgc ctg aga egg 4 8 
Met Val Pro Ile Thr Ser Leu Ala Gln Thr Leu Glu Arg Leu Arg Arg
1 5 10 15

aac tac tcc tgc tta gaa cta gta gaa act ctg ata gcg cgt tgc 96
Lys Asp Tyr Ser Cys Leu Glu Leu Val Glu Thr Leu Ile Ala Arg Cys
20 25 30

caa act gca aaa cca tta aat gcc ctt ctg gct aca gac tgg gat ggc 144
Gln Ala Ala Lys Pro Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Gly
35 40 45

ttg cgg cga agc gcc aaa aaa aat gat cgt cat gga aac gcc gga tta 192
Leu Arg Arg Ser Ala Lys Lys Asn Asp Arg His Gly Asn Ala Gly Leu
50 55 60

ggt ctt tgc ggc att cca ctc tgt ttt aag gcg aac atc gcg acc ggc 240
Gly Leu Cys Gly Ile Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

gta ttt cct aca agc gct gct act ccg gcg ctg ata aac cac ttg cca 288
Val Phe Pro Thr Ser Ala Ala Thr Pro Ala Leu Ile Asn His Leu Pro
85 90 95

aag ata cca tcc cgc gtc gca gaa aga ctt ttt tca gct gga gca ctg 336
Lys Ile Pro Ser Arg Val Ala Glu Arg Leu Phe Ser Ala Gly Ala Leu
100 105 110

ccg ggt gcc tcg gga aac atg cat gag tta tcg ttt gga att acg agc 384
Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

aac aac tat gcc acc ggt gcg gtg cgg aac ccg tgg aat cca agt ctg 432
Asn Asn Tyr Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

ata cca ggg ggt tca agc ggt ggt gtg gct gct gcg gtg gca agc cga 480
Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

ttg atg tta ggc ggc ata ggc acg gat acc ggt gca tct gtt cgc cta 528
Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

ccg gca gcc ctg tgt ggc gta gta gga ttt cga ccg acg ctt ggt cga 576
Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Leu Gly Arg
180 185 190

tat cca aga gat cgg ata ata ccg ttc agc ccc acc cgg gac acc gcc 624
Tyr Pro Arg Asp Arg Ile Ile Pro Phe Ser Pro Thr Arg Asp Thr Ala
195 200 205

gga atc ata gcg cag tgc gta gcc gat gtt ata atc ctc gac cag gtg 672
Gly Ile Ile Ala Gln Cys Val Ala Asp Val Ile Ile Leu Asp Gln Val
210 215 220

att tcc gga cgg tcg gcg aaa att tca ccc atg ccg ctg aag ggg ctt 720
Ile Ser Gly Arg Ser Ala Lys Ile Ser Pro Met Pro Leu Lys Gly Leu
225 230 235 240

cgg atc ggc ctc ccc act acc tac ttt tac gat gac ctt gat gct gat 768
Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

gtg gcc ttc gca gct gaa acg acg att cgc ttg cta gcc aac aga ggc 816
Val Ala Phe Ala Ala Glu Thr Thr Ile Arg Leu Leu Ala Asn Arg Gly
260 265 270

gta acc ttt gtt gaa gcc gac atc ccc cac cta gag gaa ttg aac agt 864
Val Thr Phe Val Glu Ala Asp Ile Pro His Leu Glu Glu Leu Asn Ser
275 280 285

ggg gca agt ttg cca att gcg ctt tac gaa ttt cca cac gct cta aaa 912
Gly Ala Ser Leu Pro Ile Ala Leu Tyr Glu Phe Pro His Ala Leu Lys
290 295 300

aag tat ctc gac gat ttt gtg gga aca gtt tct ttt tct gac gtt atc 960
Lys Tyr Leu Asp Asp Phe Val Gly Thr Val Ser Phe Ser Asp Val Ile
305 310 315 320

aaa gga att cgt agc ccc gat gta gcg aac att gtc agt gcg caa att 1008
Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Val Ser Ala Gln Ile
325 330 335

aat egg cat caa att tcc aac gat gaa tat gaa ctg gcg cgt caa tcc 1056
Asp Gly His Gln Ile Ser Asn Asp Glu Tyr Glu Leu Ala Arg Gln Ser
340 345 350

ttc agg cca agg ctc cag gcc act tat cgg aat tac ttc aga ctc tat 1104
Phe Arg Pro Arg Leu Gln Ala Thr Tyr Arg Asn Tyr Phe Arg Leu Tyr
355 360 365

cag tta gat gca atc ctt ttc cca act gca ccc tta gcg gcc aaa gcc 1152
Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Ala Ala Lys Ala
370 375 380

ata oat cag gag tcg tca gtc atc cac aat ggc tca atg atg aac act 1200
Ile Gly Gln Glu Ser Ser Val Ile His Asn Gly Ser Met Met Asn Thr
385 390 395 400

ttc aag atc tac gtg cga aat gtg gac cca agc agc aac gca ggc cta 1248
Phe Lys Ile Tyr Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

cct ggg ttg agc ctt cct gcc tgc ctt aca cct gat cgc ttg cct gtt 1296
Pro Gly Leu Ser Leu Pro Ala Cys Leu Thr Pro Asp Arg Leu Pro Val
420 425 430

gga atg gaa att gat gga tta gcg ggg tca gac cac cgt ctg tta gca 1344
Gly Met Glu Ile Asp Gly Leu Ala Gly Ser Asp His Arg Leu Leu Ala
435 440 445

atc ggg gca gca tta gaa aaa gct ata aat ttt tct tcc ttt ccc gat 1392
Ile Gly Ala Ala Leu Glu Lys Ala Ile Asn Phe Ser Ser Phe Pro Asp
450 455 460

gct ttt aat tag 1404
Ala Phe Asn
465

<210> 34
<211> 467
<212> PRT
<213> Agrobacterium tumefaciens
<400> 34

Met Val Pro Ile Thr Ser Leu Ala Gln Thr Leu Glu Arg Leu Arg Arg
1 5 10 15

Lys Asp Tyr Ser Cys Leu Glu Leu Val Glu Thr Leu Ile Ala Arg Cys
20 25 30

Gln Ala Ala Lys Pro Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Gly
35 40 45

Leu Arg Arg Ser Ala Lys Lys Asn Asp Arg His Gly Asn Ala Gly Leu
50 55 60

Gly Leu Cys Gly Ile Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

Val Phe Pro Thr Ser Ala Ala Thr Pro Ala Leu Ile Asn His Leu Pro
85 90 95

Lys Ile Pro Ser Arg Val Ala Glu Arg Leu Phe Ser Ala Gly Ala Leu
100 105 110

Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

Asn Asn Tyr Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Leu Gly Arg
180 185 190

Tyr Pro Arg Asp Arg Ile Ile Pro Phe Ser Pro Thr Arg Asp Thr Ala
195 200 205

Gly Ile Ile Ala Gln Cys Val Ala Asp Val Ile Ile Leu Asp Gln Val
210 215 220

Ile Ser Gly Arg Ser Ala Lys Ile Ser Pro Met Pro Leu Lys Gly Leu
225 230 235 240

Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

Val Ala Phe Ala Ala Glu Thr Thr Ile Arg Leu Leu Ala Asn Arg Gly
260 265 270

Val Thr Phe Val Glu Ala Asp Ile Pro His Leu Glu Glu Leu Asn Ser
275 280 285

Gly Ala Ser Leu Pro Ile Ala Leu Tyr Glu Phe Pro His Ala Leu Lys
290 295 300

Lys Tyr Leu Asp Asp Phe Val Gly Thr Val Ser Phe Ser Asp Val Ile
305 310 315 320

Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Val Ser Ala Gln Ile
325 330 335

Asp Gly His Gln Ile Ser Asn Asp Glu Tyr Glu Leu Ala Arg Gln Ser
340 345 350

Phe Arg Pro Arg Leu Gln Ala Thr Tyr Arg Asn Tyr Phe Arg Leu Tyr
355 360 365

Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Ala Ala Lys Ala
370 375 380

Ile Gly Gln Glu Ser Ser Val Ile His Asn Gly Ser Met Met Asn Thr
385 390 395 400

Phe Lys Ile Tyr Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

Pro Gly Leu Ser Leu Pro Ala Cys Leu Thr Pro Asp Arg Leu Pro Val
420 425 430

Gly Met Glu Ile Asp Gly Leu Ala Gly Ser Asp His Arg Leu Leu Ala
435 440 445

Ile Gly Ala Ala Leu Glu Lys Ala Ile Asn Phe Ser Ser Phe Pro Asp
450 455 460

Ala Phe Asn
465

<210> 35
<211> 1419
<212> DNA
<213> Agrobacterium vitis
<220>
<221> CDS
<222> (1)..(1416)
<223> coding for indoleacetamide hydrolase
<400> 35

atg gtg acc cta ggt tca atc aag gaa acc ctg gaa tgt ctc agg ctg 48
Met Val Thr Leu Gly Ser Ile Lys Glu Thr Leu Glu Cys Leu Arg Leu
1 5 10 15

aaa aaa tac tcc tgt tcc gaa ctg gct gaa acc ata ata gcc cgt tgc 96
Lys Lys Tyr Ser Cys Ser Glu Leu Ala Glu Thr Ile Ile Ala Arg Cys
20 25 30

gaa gcc gcg aaa tct ctc aat gct ctt ctg gcg act gac tgg gat tac 144
Glu Ala Ala Lys Ser Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Tyr
35 40 45

ctg cgg cgt aat gcc aag aaa gta gat gaa gat gga agc gcc ggc gag 192
Leu Arg Arg Asn Ala Lys Lys Val Asp Glu Asp Gly Ser Ala Gly Glu
50 55 60

ggt ctt gcc ggc atc ccg ctg tgt tct aaa gcg aac att gca aca ggc 240
Gly Leu Ala Gly Ile Pro Leu Cys Ser Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

ata ttc cca gca agc gcg gcc acg ccg gcg ctt gat gaa cat tta cct 288
Ile Phe Pro Ala Ser Ala Ala Thr Pro Ala Leu Asp Glu His Leu Pro
85 90 95

aca aca cca gcc ggc gtc cgt aaa ccg ctt cta gac gct ggg gca ctg 336
Thr Thr Pro Ala Gly Val Arg Lys Pro Leu Leu Asp Ala Gly Ala Leu
100 105 110

ata ggc gct tcg gga aac atg cat gag tta tcg ttt ggc att acc agt 384
Ile Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

aac aac cac gcc act ggt gcg gtg aga aac ccc tgg aat ccc agc tta 432
Asn Asn His Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

ata cca gga ggc tcg agc ggc ggc gtg gct gct gct gta gca tca cgg 480
Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

tta atg ctc ggc gga att ggc acc gac acg ggg gct tcg gtc cgc cta 528
Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

cct gca tcc cta tgt ggc gta gtg gga ttc cgc ccg acg atc ggc aga 576
Pro Ala Ser Leu Cys Gly Val Val Gly Phe Arg Pro Thr Ile Gly Arg
180 185 190

tat cct gga gac cga att gtg ccg gtt agc ccc acc cgc gat aca gcc 624
Tyr Pro Gly Asp Arg Ile Val Pro Val Ser Pro Thr Arg Asp Thr Ala
195 200 205




CA 02493364 2005-01-21 



PF 53790 



gga att atc gca cag agc gtt cct gat gtg ata ctc ctt gac caa atc 672
Gly Ile Ile Ala Gln Ser Val Pro Asp Val Ile Leu Leu Asp Gln Ile
210 215 220

att tgc ggg aag ctc acg acc cac caa cct gta ccc ctg gag gga tta 720
Ile Cys Gly Lys Leu Thr Thr His Gln Pro Val Pro Leu Glu Gly Leu
225 230 235 240

cgt atc ggc ttg cca acc act tac ttt tac gat gac ctt gat gct gat 768
Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

gtg gcc ttc gca gct gaa aac ctt atc acg ctg ctg gcc agc aag ggt 816
Val Ala Phe Ala Ala Glu Asn Leu Ile Thr Leu Leu Ala Ser Lys Gly
260 265 270

gta acc ttt gtt aag gcc gag att cca gat ctg cag cgt ctg aac atc 864
Val Thr Phe Val Lys Ala Glu Ile Pro Asp Leu Gln Arg Leu Asn Ile
275 280 285

ggg gtt agc ttt cct att gcc ctg tac gag ttt ccg ttc gcc cta caa 912
Gly Val Ser Phe Pro Ile Ala Leu Tyr Glu Phe Pro Phe Ala Leu Gln
290 295 300

aag tat atc gat gac ttt gtg aag gat gtg tct ttt tct gac gtc atc 960
Lys Tyr Ile Asp Asp Phe Val Lys Asp Val Ser Phe Ser Asp Val Ile
305 310 315 320

aaa gga att cgt agc cct gat gta gcc aac att gcc aat gct caa att 1008
Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Ala Asn Ala Gln Ile
325 330 335

gat gga cat caa att tcc aaa gct tca tat gaa ctg gcg cga caa tct 1056
Asp Gly His Gln Ile Ser Lys Ala Ser Tyr Glu Leu Ala Arg Gln Ser
340 345 350

ttc aga cca aag ctg caa gcc gcc tac cat gat tac ttc aag ctg cac 1104
Phe Arg Pro Lys Leu Gln Ala Ala Tyr His Asp Tyr Phe Lys Leu His
355 360 365

cag cta gac gcg atc ctt ttc ccg aca gct ccc ctg aca gcc aaa ccg 1152
Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Thr Ala Lys Pro
370 375 380

atc ggc caa gat tta tcg gtg atg cac aat ggc gta atg gcc gac acg 1200
Ile Gly Gln Asp Leu Ser Val Met His Asn Gly Val Met Ala Asp Thr
385 390 395 400

ttt aaa atc ttc gtg cga aat gtg gat ccg ggg agc aac gca ggc ctg 1248
Phe Lys Ile Phe Val Arg Asn Val Asp Pro Gly Ser Asn Ala Gly Leu
405 410 415

cca gga tta agc ctt ccc gtt tct ctt act tca aag ggt ttg cct att 1296
Pro Gly Leu Ser Leu Pro Val Ser Leu Thr Ser Lys Gly Leu Pro Ile
420 425 430

gga atg gaa atc gat gga tta gcg ggc atg gac gac cgt ttg cta gca 1344
Gly Met Glu Ile Asp Gly Leu Ala Gly Met Asp Asp Arg Leu Leu Ala
435 440 445

atc gga gcg gca cta gag gaa gcg ata gct ttt cat aat tta cct gac 1392
Ile Gly Ala Ala Leu Glu Glu Ala Ile Ala Phe His Asn Leu Pro Asp
450 455 460

ttc ccg aaa gtc gag aca aac tac tga 1419
Phe Pro Lys Val Glu Thr Asn Tyr
465 470

<210> 36
<211> 472
<212> PRT
<213> Agrobacterium vitis
<400> 36

Met Val Thr Leu Gly Ser Ile Lys Glu Thr Leu Glu Cys Leu Arg Leu
1 5 10 15

Lys Lys Tyr Ser Cys Ser Glu Leu Ala Glu Thr Ile Ile Ala Arg Cys
20 25 30

Glu Ala Ala Lys Ser Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Tyr
35 40 45

Leu Arg Arg Asn Ala Lys Lys Val Asp Glu Asp Gly Ser Ala Gly Glu
50 55 60

Gly Leu Ala Gly Ile Pro Leu Cys Ser Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

Ile Phe Pro Ala Ser Ala Ala Thr Pro Ala Leu Asp Glu His Leu Pro
85 90 95

Thr Thr Pro Ala Gly Val Arg Lys Pro Leu Leu Asp Ala Gly Ala Leu
100 105 110

Ile Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

Asn Asn His Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

Pro Ala Ser Leu Cys Gly Val Val Gly Phe Arg Pro Thr Ile Gly Arg
180 185 190

Tyr Pro Gly Asp Arg Ile Val Pro Val Ser Pro Thr Arg Asp Thr Ala
195 200 205

Gly Ile Ile Ala Gln Ser Val Pro Asp Val Ile Leu Leu Asp Gln Ile
210 215 220

Ile Cys Gly Lys Leu Thr Thr His Gln Pro Val Pro Leu Glu Gly Leu
225 230 235 240

Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

Val Ala Phe Ala Ala Glu Asn Leu Ile Thr Leu Leu Ala Ser Lys Gly
260 265 270

Val Thr Phe Val Lys Ala Glu Ile Pro Asp Leu Gln Arg Leu Asn Ile
275 280 285

Gly Val Ser Phe Pro Ile Ala Leu Tyr Glu Phe Pro Phe Ala Leu Gln
290 295 300

Lys Tyr Ile Asp Asp Phe Val Lys Asp Val Ser Phe Ser Asp Val Ile
305 310 315 320

Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Ala Asn Ala Gln Ile
325 330 335



Asp Gly His Gln Ile Ser Lys Ala Ser Tyr Glu Leu Ala Arg Gln Ser
340 345 350

Phe Arg Pro Lys Leu Gln Ala Ala Tyr His Asp Tyr Phe Lys Leu His
355 360 365

Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Thr Ala Lys Pro
370 375 380

Ile Gly Gln Asp Leu Ser Val Met His Asn Gly Val Met Ala Asp Thr
385 390 395 400

Phe Lys Ile Phe Val Arg Asn Val Asp Pro Gly Ser Asn Ala Gly Leu
405 410 415

Pro Gly Leu Ser Leu Pro Val Ser Leu Thr Ser Lys Gly Leu Pro Ile
420 425 430

Gly Met Glu Ile Asp Gly Leu Ala Gly Met Asp Asp Arg Leu Leu Ala
435 440 445

Ile Gly Ala Ala Leu Glu Glu Ala Ile Ala Phe His Asn Leu Pro Asp
450 455 460

Phe Pro Lys Val Glu Thr Asn Tyr
465 470



<210> 37
<211> 1263
<212> DNA
<213> Arabidopsis thaliana

<220>
<221> CDS
<222> (1)..(1260)
<223> coding for 5-methylthioribose kinase

<400> 37
atg tct ttt gag gag ttt acg ccg tta aac gag aag tct ctt gta gac 48
Met Ser Phe Glu Glu Phe Thr Pro Leu Asn Glu Lys Ser Leu Val Asp
1 5 10 15

tac atc aag tca aca cct gct ctc tct tcc aag atc gga gcc gac aag 96
Tyr Ile Lys Ser Thr Pro Ala Leu Ser Ser Lys Ile Gly Ala Asp Lys
20 25 30

tcc gat gat gat ttg gtt atc aaa gaa gtt gga gat ggc aat ctc aat 144
Ser Asp Asp Asp Leu Val Ile Lys Glu Val Gly Asp Gly Asn Leu Asn
35 40 45

ttc gtt ttc atc gtt gtt gga tcc tct ggt tct ctt gtc atc aaa cag 192
Phe Val Phe Ile Val Val Gly Ser Ser Gly Ser Leu Val Ile Lys Gln
50 55 60

gct ctt cca tat att cgc tgt atc ggt gaa tca tgg cca atg acg aaa 240
Ala Leu Pro Tyr Ile Arg Cys Ile Gly Glu Ser Trp Pro Met Thr Lys
65 70 75 80

gaa aga gct tat ttt gaa gca aca act ttg aga aag cat gga aat tta 288
Glu Arg Ala Tyr Phe Glu Ala Thr Thr Leu Arg Lys His Gly Asn Leu
85 90 95

tca cct gat cat gtt cct gaa gtc tac cat ttt gac aga aca atg gcg 336
Ser Pro Asp His Val Pro Glu Val Tyr His Phe Asp Arg Thr Met Ala
100 105 110

tta att gga atg aga tac ctt gag cct cct cat atc att ctc cgc aaa 384
Leu Ile Gly Met Arg Tyr Leu Glu Pro Pro His Ile Ile Leu Arg Lys
115 120 125

gga ctc att gct ggg att gag tat cct ttc ctc gca gac cac atg tct 432
Gly Leu Ile Ala Gly Ile Glu Tyr Pro Phe Leu Ala Asp His Met Ser
130 135 140

o« t tac ata gcg aag act ctc ttc ttc act tct ctc ctc tat cac gat 480
Asp Tyr Met Ala Lys Thr Leu Phe Phe Thr Ser Leu Leu Tyr His Asp
145 150 155 160

acc aca gag cac aga aga gca gta acc gaa ttt tgt ggt aat gtg gag
Thr Thr Glu His Arg Arg Ala Val Thr Glu Phe Cys Gly Asn Val Glu
165 170 175

tta tgc cga tta acg gag caa gtt gtg ttt tcg gac cca tat aga gtt
Leu Cys Arg Leu Thr Glu Gln Val Val Phe Ser Asp Pro Tyr Arg Val
180 185 190

tcc aca ttt aat cgt tgg act tca cct tat ctt gat gat gat gct aag
Ser Thr Phe Asn Arg Trp Thr Ser Pro Tyr Leu Asp Asp Asp Ala Lys
195 200 205

act gtg cgc gaa gac agt gcc ttg aag ctc gaa atc gca gag cta aaa
Ala Val Arg Glu Asp Ser Ala Leu Lys Leu Glu Ile Ala Glu Leu Lys
210 215 220

+ m »ta ttc tat gaa aga gct caa gct tta ata cat ggt gat ctt cat
Ser Met Phe Cys Glu Arg Ala Gln Ala Leu Ile His Gly Asp Leu His
225 230 235 240

»^ aat tct ate atg gtt act caa gat tca acg caa gtt ata gat cca
Thr Gly Ser Val Met Val Thr Gln Asp Ser Thr Gln Val Ile Asp Pro
245 250 255

nan + t t tca ttc tat gga ccg atg ggt ttc gat att ggc gct tat ctt
Glu Phe Ser Phe Tyr Gly Pro Met Gly Phe Asp Ile Gly Ala Tyr Leu
260 265 270

atrt aac tta ata cta gct ttc ttt gca caa gat gga cac gcc act cag
Gly Asn Leu Ile Leu Ala Phe Phe Ala Gln Asp Gly His Ala Thr Gln
275 280 285

aaa aat aat cga aaa gaa tac aag cag tgg atc ttg aga acc att gag
Glu Asn Asp Arg Lys Glu Tyr Lys Gln Trp Ile Leu Arg Thr Ile Glu
290 295 300

caa act tgg aat ttg ttt aac aaa agg ttc att gcg cta tgg gat caa
Gln Thr Trp Asn Leu Phe Asn Lys Arg Phe Ile Ala Leu Trp Asp Gln
305 310 315 320

... aaa aat aga cca ggc gaa gca tac ctt gca gat atc tat aac aat
Asn Lys Asp Gly Pro Gly Glu Ala Tyr Leu Ala Asp Ile Tyr Asn Asn
325 330 335

rr++ +<tcr aaa ttt gtt caa gaa aac tac atg agg aat ttg ttg
Thr Glu Val Leu Lys Phe Val Gln Glu Asn Tyr Met Arg Asn Leu Leu
340 345 350

cat gac tca ctc gga ttc ggc gct gca aag atg att agg aga att gtg
His Asp Ser Leu Gly Phe Gly Ala Ala Lys Met Ile Arg Arg Ile Val
355 360 365

„= „+ n „na cat att gag gac ttt gaa tca atc gaa gaa gat aag cga
Gly Val Ala His Val Glu Asp Phe Glu Ser Ile Glu Glu Asp Lys Arg
370 375 380

aga gct att tgc gag aga agt gca ctc gag ttt gcg aag atg ctt ctc 1200
Arg Ala Ile Cys Glu Arg Ser Ala Leu Glu Phe Ala Lys Met Leu Leu
385 390 395 400

aag gaa agg aga aag ttt aag agt atc ggt gaa gtt gtt tca gca att 1248
Lys Glu Arg Arg Lys Phe Lys Ser Ile Gly Glu Val Val Ser Ala Ile
405 410 415

caa caa caa agc taa 1263
Gln Gln Gln Ser
420


<210> 38 
<211> 420 
<212> PRT 

<213> Arabidopsis thaliana

<400> 38
Met Ser Phe Glu Glu Phe Thr Pro Leu Asn Glu Lys Ser Leu Val Asp
1 5 10 15

Tyr He Lys Ser Thr Pro Ala Leu Ser Ser Lys He Gly Ala Asp Lys 

20 25 30 

Ser Asp Asp Asp Leu Val He Lys Glu Val Gly Asp Gly Asn Leu Asn 
35 40 45 

Phe Val Phe Ile Val Val Gly Ser Ser Gly Ser Leu Val Ile Lys Gln
50 55 60 

Ala Leu Pro Tyr lie Arg Cys lie Gly Glu Ser Trp Pro Met Thr Lys 
65 70 75 80 

Glu Arg Ala Tyr Phe Glu Ala Thr Thr Leu Arg Lys His Gly Asn Leu 

85 90 95 

Ser Pro Asp His Val Pro Glu Val Tyr His Phe Asp Arg Thr Met Ala 

100 105 110 

Leu He Gly Met Arg Tyr Leu Glu Pro Pro His He He Leu Arg Lys 
115 120 125 

Gly Leu He Ala Gly He Glu Tyr Pro Phe Leu Ala Asp His Met Ser 
130 135 140 

Asp Tyr Met Ala Lys Thr Leu Phe Phe Thr Ser Leu Leu Tyr His Asp 
145 150 155 160 

Thr Thr Glu His Arg Arg Ala Val Thr Glu Phe Cys Gly Asn Val Glu 

165 170 175 

Leu Cys Arg Leu Thr Glu Gin Val Val Phe Ser Asp Pro Tyr Arg Val 

180 185 190 

Ser Thr Phe Asn Arg Trp Thr Ser Pro Tyr Leu Asp Asp Asp Ala Lys 
195 200 205 

Ala Val Arg Glu Asp Ser Ala Leu Lys Leu Glu He Ala Glu Leu Lys 
210 215 220 

Ser Met Phe Cys Glu Arg Ala Gin Ala Leu He His Gly Asp Leu His 
225 230 235 240 

Thr Gly Ser Val Met Val Thr Gin Asp Ser Thr Gin Val He Asp Pro 

245 250 255

Glu Phe Ser Phe Tyr Gly Pro Met Gly Phe Asp Ile Gly Ala Tyr Leu

260 265 270 

Gly Asn Leu lie Leu Ala Phe Phe Ala Gin Asp Gly His Ala Thr Gin 
275 280 285 

Glu Asn Asp Arg Lys Glu Tyr Lys Gin Trp He Leu Arg Thr He Glu 
290 295 300 

Gin Thr Trp Asn Leu Phe Asn Lys Arg Phe He Ala Leu Trp Asp Gin 
305 310 315 320 

Asn Lys Asp Gly Pro Gly Glu Ala Tyr Leu Ala Asp He Tyr Asn Asn 

325 330 335 

Thr Glu Val Leu Lys Phe Val Gin Glu Asn Tyr Met Arg Asn Leu Leu 

340 345 350 

His Asp Ser Leu Gly Phe Gly Ala Ala Lys Met He Arg Arg He Val 
355 360 365 

Gly Val Ala His Val Glu Asp Phe Glu Ser He Glu Glu Asp Lys Arg 
370 375 380 

Arg Ala Ile Cys Glu Arg Ser Ala Leu Glu Phe Ala Lys Met Leu Leu
385 390 395 400 

Lys Glu Arg Arg Lys Phe Lys Ser He Gly Glu Val Val Ser Ala He 

405 410 415 

Gin Gin Gin Ser 

420 



<210> 39
<211> 1200
<212> DNA
<213> Klebsiella pneumoniae

<220>
<221> CDS
<222> (1)..(1197)
<223> coding for 5-methylthioribose kinase

<400> 39
atg tcg caa tac cat acc ttc acc gcc cac gat gcc gtg gct tac gcg
Met Ser Gln Tyr His Thr Phe Thr Ala His Asp Ala Val Ala Tyr Ala
1 5 10 15

caa cag ttc gec ggc ate gac aac cca tct gag ctg gtc age gcg cag 
Gin Gin Phe Ala Gly He Asp Asn Pro Ser Glu Leu Val Ser Ala Gin 

20 25 30 

gaa gtg ggc gat ggc aac etc aat ctg gtg ttt aaa gtg ttc gat cgt 
Glu Val Gly Asp Gly Asn Leu Asn Leu Val Phe Lys Val Phe Asp Arg 
35 40 45 

cag ggc gtc age egg gcg ate gtc aaa cag gec ctg ccc tac gtg cgc 
Gin Gly Val Ser Arg Ala He Val Lys Gin Ala Leu Pro Tyr Val Arg 
50 55 60 

tgc gtc ggc gaa tec tgg ccg ctg acc etc gac cgc gec cgt etc gaa 
Cys Val Gly Glu Ser Trp Pro Leu Thr Leu Asp Arg Ala Arg Leu Glu 
65 70 75 80

gcg cag acc ctg gtc gcc cac tat cag cac age ccg cag cac acg gta 
Ala Gin Thr Leu Val Ala His Tyr Gin His Ser Pro Gin His Thr Val 

85 90 95 

aaa ate cat cac ttt gat ccc gag ctg gcg gtg atg gtg atg gaa gat 
Lys lie His His Phe Asp Pro Glu Leu Ala Val Met Val Met Glu Asp 

100 105 110 

ctt tec gac cac cgc ate tgg cgc gga gag ctt ate get aac gtc tac 
Leu Ser Asp His Arg lie Trp Arg Gly Glu Leu lie Ala Asn Val Tyr 
115 120 125 

tat ccc cag gcg gcc cgc cag ctt ggc gac tat ctg gcg cag gtg ttg 
Tyr Pro Gin Ala Ala Arg Gin Leu Gly Asp Tyr Leu Ala Gin Val Leu 
130 135 140 

ttc cac acc age gat ttc tac etc cat ccc cac gag aaa aag gcg cag 
Phe His Thr Ser Asp Phe Tyr Leu His Pro His Glu Lys Lys Ala Gin 
145 150 155 160 

gtg gcg cag ttt att aac ccg gcg atg tgc gag ate acc gag gat ctg 
Val Ala Gin Phe lie Asn Pro Ala Met Cys Glu lie Thr Glu Asp Leu 

165 170 175 

ttc ttt aac gac ccg tat cag ate cac gag cgc aat aac tac ccg gcg 
Phe Phe Asn Asp Pro Tyr Gin lie His Glu Arg Asn Asn Tyr Pro Ala 

180 185 190 

gag ctg gag gcc gat gtc gcc gcc ctg cgc gac gac gcc cag ctt aag 
Glu Leu Glu Ala Asp Val Ala Ala Leu Arg Asp Asp Ala Gin Leu Lys 
195 200 205 

ctg gcg gtg gcg gcg ctg aag cac cgt ttc ttt gcc cat gcg gaa gcg 
Leu Ala Val Ala Ala Leu Lys His Arg Phe Phe Ala His Ala Glu Ala 
210 215 220 

ctg ctg cac ggc gat ate cac age ggg teg ate ttc gtt gcc gaa ggt 
Leu Leu His Gly Asp lie His Ser Gly Ser lie Phe Val Ala Glu Gly 
225 230 235 240 

age ctg aag gcc ate gac gcc gag ttc ggc tac ttc ggc ccc ate ggc 
Ser Leu Lys Ala lie Asp Ala Glu Phe Gly Tyr Phe Gly Pro lie Gly 

245 250 255 

ttc gat ate ggc acc gcc ate ggc aac ctg ctg ctg aac tac tgc ggc 
Phe Asp lie Gly Thr Ala lie Gly Asn Leu Leu Leu Asn Tyr Cys Gly 

260 265 270 



288 



ctg ccg ggc cag etc ggc att cgc 
Leu Pro Gly Gin Leu Gly lie Arg 
275 280 



gcc gcc gcc gcg cgc gag cag 
Ala Ala Ala Ala Arg Glu Gin 

285 



egg ctg aac gac ate cac cag ctg tgg acc acc ttc gcc gag cgc ttc 
Arg Leu Asn Asp lie His Gin Leu Trp Thr Thr Phe Ala Glu Arg Phe 
290 295 300 

cag gcg ctg gcg gcg gag aaa acc cgc gac gcg gcg ctg get tac ccc 
Gin Ala Leu Ala Ala Glu Lys Thr Arg Asp Ala Ala Leu Ala Tyr Pro 
305 310 315 320 

ggc tac gcc tec gcc ttt ctg aag aaa gtc tgg gcg gac gcg gtc ggc 
Gly Tyr Ala Ser Ala Phe Leu Lys Lys Val Trp Ala Asp Ala Val Gly 

325 330 335 

ttc tgc ggc agc gaa ctg atc cgc cgc agc gtc gga ctg tcg cac gtc
Phe Cys Gly Ser Glu Leu Ile Arg Arg Ser Val Gly Leu Ser His Val
340 345 350

gcg gat atc gac act atc cag gac gac gcc atg cgt cat gag tgc ctg 1104
Ala Asp Ile Asp Thr Ile Gln Asp Asp Ala Met Arg His Glu Cys Leu
355 360 365

cgc cac gcc att acc ctg ggc aga gcg ctg atc gtg ctg gcc gag cgt 1152
Arg His Ala Ile Thr Leu Gly Arg Ala Leu Ile Val Leu Ala Glu Arg
370 375 380

atc gac agc gtc gac gag ctg ctg gcg cgg gta cgc cag tac agc tga 1200
Ile Asp Ser Val Asp Glu Leu Leu Ala Arg Val Arg Gln Tyr Ser
385 390 395

<210> 40 
<211> 399 
<212> PRT 

<213> Klebsiella pneumoniae 
<400> 40 

Met Ser Gin Tyr His Thr Phe Thr Ala His Asp Ala Val Ala Tyr Ala 
1 5 10 15 

Gin Gin Phe Ala Gly lie Asp Asn Pro Ser Glu Leu Val Ser Ala Gin 

20 25 30 

Glu Val Gly Asp Gly Asn Leu Asn Leu Val Phe Lys Val Phe Asp Arg 
35 40 45 

Gin Gly Val Ser Arg Ala lie Val Lys Gin Ala Leu Pro Tyr Val Arg 

50 55 60 

Cys Val Gly Glu Ser Trp Pro Leu Thr Leu Asp Arg Ala Arg Leu Glu
65 70 75 80 

Ala Gin Thr Leu Val Ala His Tyr Gin His Ser Pro Gin His Thr Val 

85 90 95 

Lys He His His Phe Asp Pro Glu Leu Ala Val Met Val Met Glu Asp 

100 105 110

Leu Ser Asp His Arg He Trp Arg Gly Glu Leu He Ala Asn Val Tyr 

115 120 125 

Tyr Pro Gin Ala Ala Arg Gin Leu Gly Asp Tyr Leu Ala Gin Val Leu 
130 135 140 

Phe His Thr Ser Asp Phe Tyr Leu His Pro His Glu Lys Lys Ala Gin 
145 150 155 160 

Val Ala Gin Phe He Asn Pro Ala Met Cys Glu He Thr Glu Asp Leu 

165 170 175 

Phe Phe Asn Asp Pro Tyr Gin He His Glu Arg Asn Asn Tyr Pro Ala 

180 185 190 

Glu Leu Glu Ala Asp Val Ala Ala Leu Arg Asp Asp Ala Gin Leu Lys 
195 200 205 

Leu Ala Val Ala Ala Leu Lys His Arg Phe Phe Ala His Ala Glu Ala

210 215 220 

Leu Leu His Gly Asp He His Ser Gly Ser He Phe Val Ala Glu Gly 
225 230 235 240

Ser Leu Lys Ala He Asp Ala Glu Phe Gly Tyr Phe Gly Pro He Gly 

245 250 255 

Phe Asp He Gly Thr Ala He Gly Asn Leu Leu Leu Asn Tyr Cys Gly 

260 265 270

Leu Pro Gly Gln Leu Gly Ile Arg Asp Ala Ala Ala Ala Arg Glu Gln
275 280 285

Arg Leu Asn Asp Ile His Gln Leu Trp Thr Thr Phe Ala Glu Arg Phe
290 295 300

Gln Ala Leu Ala Ala Glu Lys Thr Arg Asp Ala Ala Leu Ala Tyr Pro
305 310 315 320

Gly Tyr Ala Ser Ala Phe Leu Lys Lys Val Trp Ala Asp Ala Val Gly
325 330 335

Phe Cys Gly Ser Glu Leu Ile Arg Arg Ser Val Gly Leu Ser His Val
340 345 350

Ala Asp Ile Asp Thr Ile Gln Asp Asp Ala Met Arg His Glu Cys Leu
355 360 365

Arg His Ala Ile Thr Leu Gly Arg Ala Leu Ile Val Leu Ala Glu Arg
370 375 380

Ile Asp Ser Val Asp Glu Leu Leu Ala Arg Val Arg Gln Tyr Ser
385 390 395



<210> 41
<211> 1140
<212> DNA
<213> Arabidopsis thaliana

<220>
<221> CDS
<222> (1)..(1137)
<223> coding for alcohol dehydrogenase

<400> 41
atg tct acc acc gga cag att att cga tgc aaa gct gct gtg gca tgg
Met Ser Thr Thr Gly Gln Ile Ile Arg Cys Lys Ala Ala Val Ala Trp
1 5 10 15

gaa gec gga aag cca ctg gtg ate gag gaa gtg gag gtt get cca ccg 
Glu Ala Gly Lys Pro Leu Val lie Glu Glu Val Glu Val Ala Pro Pro 

20 25 30 

cag aaa cac gaa gtt cgt ate aag att etc ttc act tct etc tgt cac 
Gin Lys His Glu Val Arg He Lys He Leu Phe Thr Ser Leu Cys His 
35 40 45 

acc gat gtt tac ttc tgg gaa get aag gga caa aca ccg ttg ttt cca 
Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gin Thr Pro Leu Phe Pro 
50 55 60 

cgt ate ttc ggc cat gaa get gga ggg att gtt gag agt gtt gga gaa 
Arg He Phe Gly His Glu Ala Gly Gly He Val Glu Ser Val Gly Glu 
65 70 75 80 

gga gtg act gat ctt cag cca gga gat cat gtg ttg ccg ate ttt acc 
Gly Val Thr Asp Leu Gin Pro Gly Asp His Val Leu Pro He Phe Thr 

85 90 95 

gga gaa tgt gga gat tgt cgt cat tgc cag teg gag gaa tea aac atg 
Gly Glu Cys Gly Asp Cys Arg His Cys Gin Ser Glu Glu Ser Asn Met 

100 105 110 

tgt gat ctt ctc agg atc aac aca gag cga gga ggt atg att cac gat
Cys Asp Leu Leu Arg Ile Asn Thr Glu Arg Gly Gly Met Ile His Asp
115 120 125

ggt gaa tct aga ttc tcc att aat ggc aaa cca atc tac cat ttc ctt 432
Gly Glu Ser Arg Phe Ser Ile Asn Gly Lys Pro Ile Tyr His Phe Leu
130 135 140

ggg acg tec acg ttc agt gag tac act gtg gtt cac tct ggt cag gtc 480 
Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Val His Ser Gly Gin Val 
145 150 155 160 

get aag ate aat ccg gat get cct ctt gac aag gtc tgt att gtc agt 
Ala Lys lie Asn Pro Asp Ala Pro Leu Asp Lys Val Cys He Val Ser 

165 170 175 

tgt ggt ttg tct act ggg tta gga gca act ttg aat gtg get aaa ccc 
Cys Gly Leu Ser Thr Gly Leu Gly Ala Thr Leu Asn Val Ala Lys Pro 

180 185 190 

aag aaa ggt caa agt gtt gee att ttt ggt ctt ggt get gtt ggt tta 
Lys Lys Gly Gin Ser Val Ala He Phe Gly Leu Gly Ala Val Gly Leu 
195 200 205 

ggc get gca gaa ggt get aga ate get ggt get tct agg ate ate ggt 672 
Gly Ala Ala Glu Gly Ala Arg lie Ala Gly Ala Ser Arg He He Gly 
210 215 220 

gtt gat ttt aac tct aaa aga ttc gac caa get aag gaa ttc ggt gtg 
Val Asp Phe Asn Ser Lys Arg Phe Asp Gin Ala Lys Glu Phe Gly Val 
225 230 235 240 

ace gag tgt gtg aac ccg aaa gac cat gac aag cca att caa cag gtg 
Thr Glu Cys Val Asn Pro Lys Asp His Asp Lys Pro He Gin Gin Val 

245 250 255 

ate get gag atg acg gat ggt ggg gtg gac agg agt gtg gaa tgc acc 
He Ala Glu Met Thr Asp Gly Gly Val Asp Arg Ser Val Glu Cys Thr 

260 265 270 

gga age gtt cag gee atg att caa gca ttt gaa tgt gtc cac gat ggc 
Gly Ser Val Gin Ala Met He Gin Ala Phe Glu Cys Val His Asp Gly 
275 280 285 

fcgg 99* gtt gca gtg ctg gtg ggt gtg cca age aaa gac gat gec ttc 912 
Trp Gly Val Ala Val Leu Val Gly Val Pro Ser Lys Asp Asp Ala Phe 
290 295 300 

aag act cat ccg atg aat ttc ttg aat gag agg act ctt aag ggt act 
Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr 
305 310 315 320 

ttc ttc ggg aac tac aaa ccc aaa act gac att ccc ggg gtt gtg gaa 
Phe Phe Gly Asn Tyr Lys Pro Lys Thr Asp He Pro Gly Val Val Glu 

325 330 335 

aag tac atg aac aag gag ctg gag ctt gag aaa ttc ate act cac aca 
Lys Tyr Met Asn Lys Glu Leu Glu Leu Glu Lys Phe He Thr His Thr 

340 345 350 

gtg cca ttc teg gaa ate aac aag gec ttt gat tac atg ctg aag gga 
Val Pro Phe Ser Glu He Asn Lys Ala Phe Asp Tyr Met Leu Lys Gly 
355 360 365 

gag agt att cgt tgc atc atc acc atg ggt gct tga
Glu Ser Ile Arg Cys Ile Ile Thr Met Gly Ala
370 375

<210> 42
<211> 379
<212> PRT
<213> Arabidopsis thaliana

<400> 42
Met Ser Thr Thr Gly Gln Ile Ile Arg Cys Lys Ala Ala Val Ala Trp
1 5 10 15

Glu Ala Gly Lys Pro Leu Val He Glu Glu Val Glu Val Ala Pro Pro 

20 25 30 

Gin Lys His Glu Val Arg He Lys lie Leu Phe Thr Ser Leu Cys His 
35 40 45 

Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gin Thr Pro Leu Phe Pro 
50 55 60 

Arg He Phe Gly His Glu Ala Gly Gly lie Val Glu Ser Val Gly Glu 
65 70 75 80 

Gly Val Thr Asp Leu Gin Pro Gly Asp His Val Leu Pro lie Phe Thr 

85 90 95 

Gly Glu Cys Gly Asp Cys Arg His Cys Gin Ser Glu Glu Ser Asn Met 

100 105 110 

Cys Asp Leu Leu Arg He Asn Thr Glu Arg Gly Gly Met He His Asp 
115 120 125 

Gly Glu Ser Arg Phe Ser He Asn Gly Lys Pro He Tyr His Phe Leu 
130 135 140 

Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Val His Ser Gly Gin Val 
145 150 155 160 

Ala Lys He Asn Pro Asp Ala Pro Leu Asp Lys Val Cys He Val Ser 

165 170 175 

Cys Gly Leu Ser Thr Gly Leu Gly Ala Thr Leu Asn Val Ala Lys Pro 

180 185 190 

Lys Lys Gly Gin Ser Val Ala He Phe Gly Leu Gly Ala Val Gly Leu 
195 200 205 

Gly Ala Ala Glu Gly Ala Arg He Ala Gly Ala Ser Arg He He Gly 
210 215 220 

Val Asp Phe Asn Ser Lys Arg Phe Asp Gin Ala Lys Glu Phe Gly Val 
225 230 235 240 

Thr Glu Cys Val Asn Pro Lys Asp His Asp Lys Pro He Gin Gin Val 

245 250 255 

He Ala Glu Met Thr Asp Gly Gly Val Asp Arg Ser Val Glu Cys Thr 

260 265 270 

Gly Ser Val Gin Ala Met He Gin Ala Phe Glu Cys Val His Asp Gly 
275 280 285 

Trp Gly Val Ala Val Leu Val Gly Val Pro Ser Lys Asp Asp Ala Phe 
290 295 300 

Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr 
305 310 315 320 

Phe Phe Gly Asn Tyr Lys Pro Lys Thr Asp He Pro Gly Val Val Glu 

325 330 335 

Lys Tyr Met Asn Lys Glu Leu Glu Leu Glu Lys Phe He Thr His Thr 

340 345 350

Val Pro Phe Ser Glu Ile Asn Lys Ala Phe Asp Tyr Met Leu Lys Gly
355 360 365

Glu Ser Ile Arg Cys Ile Ile Thr Met Gly Ala
370 375

<210> 43
<211> 1140
<212> DNA
<213> Hordeum vulgare

<220>
<221> CDS
<222> (1)..(1137)
<223> coding for alcohol dehydrogenase

<400> 43
atg gcg acg gcc ggc aag gtg atc aag tgc aaa gcc gcg gtg gcg tgg
Met Ala Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Ala Trp
1 5 10 15

gag gcc ggg aag ccg ctg acc atg gag gag gtg gag gtg gcg ccg ccg 
Glu Ala Gly Lys Pro Leu Thr Met Glu Glu Val Glu Val Ala Pro Pro 

20 25 30 

cag gcc atg gag gtg cgc gtc aag ate etc ttc acc tec etc tgc cac 
Gin Ala Met Glu Val Arg Val Lys He Leu Phe Thr Ser Leu Cys His 
35 40 45 

acc gac gtc tac ttc tgg gag gcc aag ggg cag acc ccc atg ttc cct 192 
Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gin Thr Pro Met Phe Pro 

50 55 60 

egg ate ttc ggc cat gaa get gga ggc ata gtg gag agt gtt gga gag 
Arg Ile Phe Gly His Glu Ala Gly Gly Ile Val Glu Ser Val Gly Glu
65 70 75 80 

age gtg act gat gtt gcc cct ggt gac cac gtc etc cct gtg ttc act 
Gly Val Thr Asp Val Ala Pro Gly Asp His Val Leu Pro Val Phe Thr 

85 90 95 

ggg gag tgt aag gaa tgc cca cat tgc aag tct gcg gag age aac atg 
Gly Glu Cys Lys Glu Cys Pro His Cys Lys Ser Ala Glu Ser Asn Met
100 105 110

tgt gat ctg etc agg ate aac acc gac aga ggt gtg atg ate ggg gat 
Cys Asp Leu Leu Arg He Asn Thr Asp Arg Gly Val Met He Gly Asp 
115 120 125 

ggc aag teg cgc ttc tct att ggc ggc aag ccg att tac cat ttc gta 
Gly Lys Ser Arg Phe Ser He Gly Gly Lys Pro He Tyr His Phe Val 

130 135 140 

gag act tec acc ttc agt gag tac act gtc atg cat gtc ggt tgt gtt 
Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Met His Val Gly Cys Val
145 150 155 160 

gcc aag ate aac cct gag get ccc ctt gat aaa gtc tgt gtt ctt age 
Ala Lys He Asn Pro Glu Ala Pro Leu Asp Lys Val Cys Val Leu Ser 

165 170 175 

tgt ggt att tgc act ggt ctt ggc gcg tea att aat gtt gca aaa cca 
Cys Gly He Cys Thr Gly Leu Gly Ala Ser He Asn Val Ala Lys Pro 

180 185 190

cca aag ggt tcc aca gtg gcg ata ttt ggg cta gga gct gtt ggc ctt 624
Pro Lys Gly Ser Thr Val Ala Ile Phe Gly Leu Gly Ala Val Gly Leu
195 200 205 

get get gca gaa ggt gca agg att gca ggt gca tea agg ate att ggt 672 
Ala Ala Ala Glu Gly Ala Arg lie Ala Gly Ala Ser Arg He He Gly 
210 215 220 

gtt gac ctg aac gcc agc aga ttt gaa gag gct agg aag ttt ggc tgc 720
Val Asp Leu Asn Ala Ser Arg Phe Glu Glu Ala Arg Lys Phe Gly Cys
225 230 235 240

acg gaa ttt gtg aac ccg aaa gat cac acc aag cca gtt cag cag gtg 768
Thr Glu Phe Val Asn Pro Lys Asp His Thr Lys Pro Val Gln Gln Val

245 250 255 

etc get gac atg aca aat ggc gga gtt gac cgc agt gtt gag tgc act 816 
Leu Ala Asp Met Thr Asn Gly Gly Val Asp Arg Ser Val Glu Cys Thr 

260 265 270 

ggc aac gtc aat get atg ata caa gca ttt gaa tgt gtt cat gat ggc 864 
Gly Asn Val Asn Ala Met He Gin Ala Phe Glu Cys Val His Asp Gly 
275 280 285 

tgg ggt gta get gtg ctg gtg ggt gtg cca cac aag gac get gaa ttc 912 
Trp Gly Val Ala Val Leu Val Gly Val Pro His Lys Asp Ala Glu Phe 
290 295 300 

aag acc cac ccg atg aac ttc ctg aat gag agg acc ctg aag ggc acc 960 
Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr 
305 310 315 320 

ttc ttc ggt aac ttc aag ccg cgc act gac ctg ccc aat gtc gtg gag 1008 
Phe Phe Gly Asn Phe Lys Pro Arg Thr Asp Leu Pro Asn Val Val Glu 

325 330 335 

atg tac atg aag aag gag ctg gag gtg gag aag ttc atc aca cac agc 1056
Met Tyr Met Lys Lys Glu Leu Glu Val Glu Lys Phe Ile Thr His Ser
340 345 350

gtg ccg ttc tcg gag ata aac aag gcc ttc gac ctt atg gcg aag ggg 1104
Val Pro Phe Ser Glu Ile Asn Lys Ala Phe Asp Leu Met Ala Lys Gly
355 360 365

gag ggc atc cgt tgc atc atc cgc atg gac aac tag 1140
Glu Gly Ile Arg Cys Ile Ile Arg Met Asp Asn
370 375

<210> 44 
<211> 379 
<212> PRT 

<213> Hordeum vulgare 
<400> 44 

Met Ala Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Ala Trp
1 5 10 15

Glu Ala Gly Lys Pro Leu Thr Met Glu Glu Val Glu Val Ala Pro Pro
20 25 30

Gln Ala Met Glu Val Arg Val Lys Ile Leu Phe Thr Ser Leu Cys His
35 40 45

Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gln Thr Pro Met Phe Pro
50 55 60

Arg Ile Phe Gly His Glu Ala Gly Gly Ile Val Glu Ser Val Gly Glu
65 70 75 80

Gly Val Thr Asp Val Ala Pro Gly Asp His Val Leu Pro Val Phe Thr
85 90 95

Gly Glu Cys Lys Glu Cys Pro His Cys Lys Ser Ala Glu Ser Asn Met
100 105 110

Cys Asp Leu Leu Arg Ile Asn Thr Asp Arg Gly Val Met Ile Gly Asp
115 120 125

Gly Lys Ser Arg Phe Ser Ile Gly Gly Lys Pro Ile Tyr His Phe Val
130 135 140

Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Met His Val Gly Cys Val
145 150 155 160

Ala Lys Ile Asn Pro Glu Ala Pro Leu Asp Lys Val Cys Val Leu Ser
165 170 175

Cys Gly Ile Cys Thr Gly Leu Gly Ala Ser Ile Asn Val Ala Lys Pro
180 185 190

Pro Lys Gly Ser Thr Val Ala Ile Phe Gly Leu Gly Ala Val Gly Leu
195 200 205

Ala Ala Ala Glu Gly Ala Arg Ile Ala Gly Ala Ser Arg Ile Ile Gly
210 215 220

Val Asp Leu Asn Ala Ser Arg Phe Glu Glu Ala Arg Lys Phe Gly Cys
225 230 235 240

Thr Glu Phe Val Asn Pro Lys Asp His Thr Lys Pro Val Gln Gln Val
245 250 255

Leu Ala Asp Met Thr Asn Gly Gly Val Asp Arg Ser Val Glu Cys Thr
260 265 270

Gly Asn Val Asn Ala Met Ile Gln Ala Phe Glu Cys Val His Asp Gly
275 280 285

Trp Gly Val Ala Val Leu Val Gly Val Pro His Lys Asp Ala Glu Phe
290 295 300

Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr
305 310 315 320

Phe Phe Gly Asn Phe Lys Pro Arg Thr Asp Leu Pro Asn Val Val Glu
325 330 335

Met Tyr Met Lys Lys Glu Leu Glu Val Glu Lys Phe Ile Thr His Ser
340 345 350

Val Pro Phe Ser Glu Ile Asn Lys Ala Phe Asp Leu Met Ala Lys Gly
355 360 365

Glu Gly Ile Arg Cys Ile Ile Arg Met Asp Asn
370 375



<210> 45
<211> 1140
<212> DNA
<213> Oryza sativa

<220>
<221> CDS
<222> (1)..(1137)
<223> coding for alcohol dehydrogenase

<400> 45
atg gcg acc gca ggg aag gtg atc aag tgc aaa gcg gcg gtg gca tgg 48
Met Ala Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Ala Trp
1 5 10 15

gag gec gcg aag ccg ctg gtg ate gag gag gtg gag gtg gcg ccg ccg 96 
Glu Ala Ala Lys Pro Leu Val lie Glu Glu Val Glu Val Ala Pro Pro 

20 25 30 

cag gcc atg gag gtg cgc gtc aag atc ctc ttc acc tcg ctc tgc cac 144
Gln Ala Met Glu Val Arg Val Lys Ile Leu Phe Thr Ser Leu Cys His
35 40 45 

acc gac gtc tac ttc tgg gag gec aag gga cag act ccc gtg ttc cct 192 
Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gin Thr Pro Val Phe Pro 
50 55 60 

egg ate ttc ggc cat gaa get gga ggt att gtg gag agt gtt gga gag 240 
Arg He Phe Gly His Glu Ala Gly Gly He Val Glu Ser Val Gly Glu 
65 70 75 80 

ggt gtg act gat ctt gee cct ggt gac cat gtt etc cct gtg ttc act 288 
Gly Val Thr Asp Leu Ala Pro Gly Asp His Val Leu Pro Val Phe Thr 

85 90 95 

ggg gag tgc aag gag tgt gee cac tgc aag tea gca gag age aac atg 336 
Gly Glu Cys Lys Glu Cys Ala His Cys Lys Ser Ala Glu Ser Asn Met 

100 105 110 

tgt gat ctg etc agg ate aac act gac agg ggt gtg atg att ggt gat 384 
Cys Asp Leu Leu Arg lie Asn Thr Asp Arg Gly Val Met He Gly Asp 
115 120 125 

ggc aaa tea cgc ttt tec ate aac ggg aag ccc att tac cat ttc gtc 432 
Gly Lys Ser Arg Phe Ser lie Asn Gly Lys Pro lie Tyr His Phe Val 
130 135 140 

ggg act teg acc ttc age gag tac act gtc atg cat gtt ggt tgc gtt 480 
Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Met His Val Gly Cys Val 
145 150 155 160 

gcg aag ate aac ccg gca get cca ctt gat aaa gtt tgc gtt ctt age 528 
Ala Lys lie Asn Pro Ala Ala Pro Leu Asp Lys Val Cys Val Leu Ser 

165 170 175 

tgt ggt att tct act ggt ctt ggt get aca ate aat gtg gca aag cca 576 
Cys Gly He Ser Thr Gly Leu Gly Ala Thr lie Asn Val Ala Lys Pro 

180 185 190 

cca aag ggt teg acg gtg gcg ata ttt ggt eta gga get gta ggc ctt 624 
Pro Lys Gly Ser Thr Val Ala He Phe Gly Leu Gly Ala Val Gly Leu 
195 200 205 

gct gcc gca gaa ggt gca agg att gca gga gcg tca agg atc att ggc 672
Ala Ala Ala Glu Gly Ala Arg He Ala Gly Ala Ser Arg He He Gly 
210 215 220 

att gac ctg aac gcc aac aga ttt gaa gaa gct agg aaa ttt ggt tgc 720
He Asp Leu Asn Ala Asn Arg Phe Glu Glu Ala Arg Lys Phe Gly Cys 
225 230 235 240 

act gaa ttt gtg aac cca aag gac cat gac aag cca gtt cag cag gta 768
Thr Glu Phe Val Asn Pro Lys Asp His Asp Lys Pro Val Gln Gln Val
245 250 255

ctt gct gag atg acc aat ggc gga gtt gac cgc agc gtt gaa tgc act 816
Leu Ala Glu Met Thr Asn Gly Gly Val Asp Arg Ser Val Glu Cys Thr
260 265 270

ggc aac atc aac gcc atg atc caa gca ttt gaa tgt gtt cat gat ggc 864
Gly Asn Ile Asn Ala Met Ile Gln Ala Phe Glu Cys Val His Asp Gly
275 280 285

tgg ggt gtt gct gtt ttg gtc ggc gtg cca cac aag gac gcc gag ttc 912
Trp Gly Val Ala Val Leu Val Gly Val Pro His Lys Asp Ala Glu Phe
290 295 300

aag acc cac ccg atg aac ttc ctg aac gag agg act ctc aag gga acc 960
Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr
305 310 315 320

ttc ttc ggc aac tac aag cca cgc acc gat ctg ccc aac gtc gtc gag 1008
Phe Phe Gly Asn Tyr Lys Pro Arg Thr Asp Leu Pro Asn Val Val Glu
325 330 335

ctc tac atg aag aag gag ctg gag gtg gag aag ttc atc aca cac agc 1056
Leu Tyr Met Lys Lys Glu Leu Glu Val Glu Lys Phe Ile Thr His Ser
340 345 350

gtg ccg ttc tcg gag atc aac acg gcg ttc gac ctg atg cac aag ggc 1104
Val Pro Phe Ser Glu Ile Asn Thr Ala Phe Asp Leu Met His Lys Gly
355 360 365

gag ggc atc cgc tgc atc atc cgc atg gag aac tga 1140
Glu Gly Ile Arg Cys Ile Ile Arg Met Glu Asn
370 375

<210> 46
<211> 379
<212> PRT
<213> Oryza sativa

<400> 46
Met Ala Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Ala Trp
1 5 10 15

Glu Ala Ala Lys Pro Leu Val Ile Glu Glu Val Glu Val Ala Pro Pro
20 25 30

Gln Ala Met Glu Val Arg Val Lys Ile Leu Phe Thr Ser Leu Cys His
35 40 45

Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gln Thr Pro Val Phe Pro
50 55 60

Arg Ile Phe Gly His Glu Ala Gly Gly Ile Val Glu Ser Val Gly Glu
65 70 75 80

Gly Val Thr Asp Leu Ala Pro Gly Asp His Val Leu Pro Val Phe Thr
85 90 95

Gly Glu Cys Lys Glu Cys Ala His Cys Lys Ser Ala Glu Ser Asn Met
100 105 110

Cys Asp Leu Leu Arg Ile Asn Thr Asp Arg Gly Val Met Ile Gly Asp
115 120 125

Gly Lys Ser Arg Phe Ser Ile Asn Gly Lys Pro Ile Tyr His Phe Val
130 135 140

Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Met His Val Gly Cys Val
145 150 155 160

Ala Lys Ile Asn Pro Ala Ala Pro Leu Asp Lys Val Cys Val Leu Ser
165 170 175

Cys Gly lie Ser Thr Gly Leu Gly Ala Thr lie Asn Val Ala Lys Pro 

180 185 190 

Pro Lys Gly Ser Thr Val Ala lie Phe Gly Leu Gly Ala Val Gly Leu 
195 200 205 

Ala Ala Ala Glu Gly Ala Arg lie Ala Gly Ala Ser Arg lie lie Gly 
210 215 220 

lie Asp Leu Asn Ala Asn Arg Phe Glu Glu Ala Arg Lys Phe Gly Cys 
225 230 235 240 

Thr Glu Phe Val Asn Pro Lys Asp His Asp Lys Pro Val Gin Gin Val 

245 250 255 

Leu Ala Glu Met Thr Asn Gly Gly Val Asp Arg Ser Val Glu Cys Thr 

260 265 270 

Gly Asn Ile Asn Ala Met Ile Gln Ala Phe Glu Cys Val His Asp Gly
275 280 285 

Trp Gly Val Ala Val Leu Val Gly Val Pro His Lys Asp Ala Glu Phe 
290 295 300 

Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr
305 310 315 320 

Phe Phe Gly Asn Tyr Lys Pro Arg Thr Asp Leu Pro Asn Val Val Glu

325 330 335 

Leu Tyr Met Lys Lys Glu Leu Glu Val Glu Lys Phe lie Thr His Ser 

340 345 350 

Val Pro Phe Ser Glu lie Asn Thr Ala Phe Asp Leu Met His Lys Gly 
355 360 365 

Glu Gly Ile Arg Cys Ile Ile Arg Met Glu Asn
370 375 



<210> 47 
<211> 1140 
<212> DNA 
<213> Zea mays 

<220> 

<221> CDS 

<222> (1)..{1137) 

<223> coding for alcohol dehydrogenase 
<400> 47 

atg gcg acc gcg ggg aag gtg atc aag tgc aaa gct gcg gtg gca tgg 48
Met Ala Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Ala Trp
1 5 10 15

gag gcc ggc aag cca ctg tcg atc gag gag gtg gag gta gcg cct ccg 96
Glu Ala Gly Lys Pro Leu Ser Ile Glu Glu Val Glu Val Ala Pro Pro
20 25 30

cag gcc atg gag gtg cgc gtc aag atc ctc ttc acc tcg ctc tgc cac 144
Gln Ala Met Glu Val Arg Val Lys Ile Leu Phe Thr Ser Leu Cys His
35 40 45

acc gac gtc tac ttc tgg gag gcc aag ggg cag act ccc gtg ttc cct 192
Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gln Thr Pro Val Phe Pro
50 55 60

cgg atc ttt ggc cat gag gct gga ggt atc ata gag agt gtt gga gag 240
Arg Ile Phe Gly His Glu Ala Gly Gly Ile Ile Glu Ser Val Gly Glu
65 70 75 80

ggt gtg act gac gta gct ccg ggc gac cat gtc ctt cct gtg ttc act 288
Gly Val Thr Asp Val Ala Pro Gly Asp His Val Leu Pro Val Phe Thr
85 90 95

ggg gag tgc aag gag tgc gcc cac tgc aag tcg gca gag agc aac atg 336
Gly Glu Cys Lys Glu Cys Ala His Cys Lys Ser Ala Glu Ser Asn Met
100 105 110

tgt gat ttg ctc agg atc aac act gac cgc ggt gtg atg att ggc gat 384
Cys Asp Leu Leu Arg Ile Asn Thr Asp Arg Gly Val Met Ile Gly Asp
115 120 125

ggc aag tcg cgg ttt tca atc aat ggg aag cct atc tac cac ttt gtt 432
Gly Lys Ser Arg Phe Ser Ile Asn Gly Lys Pro Ile Tyr His Phe Val
130 135 140

ggg act tcc acc ttc agc gag tac acc gtc atg cat gtc ggt tgt gtt 480
Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Met His Val Gly Cys Val
145 150 155 160

gca aag atc aac cct cag gct ccc ctt gat aaa gtt tgc gtc ctt agc 528
Ala Lys Ile Asn Pro Gln Ala Pro Leu Asp Lys Val Cys Val Leu Ser
165 170 175

tgt ggt att tct act ggt ctt ggt gca tca att aat gtt gca aaa cct 576
Cys Gly Ile Ser Thr Gly Leu Gly Ala Ser Ile Asn Val Ala Lys Pro
180 185 190

ccg aag ggt tcg aca gtg gct gtt ttc ggt tta gga gcc gtt ggt ctt 624
Pro Lys Gly Ser Thr Val Ala Val Phe Gly Leu Gly Ala Val Gly Leu
195 200 205

gcc gct gca gaa ggt gca agg att gct gga gcg tca agg atc att ggt 672
Ala Ala Ala Glu Gly Ala Arg Ile Ala Gly Ala Ser Arg Ile Ile Gly
210 215 220

gtc gac ctg aac ccc agc aga ttc gaa gaa gct agg aag ttc ggt tgc 720
Val Asp Leu Asn Pro Ser Arg Phe Glu Glu Ala Arg Lys Phe Gly Cys
225 230 235 240

act gaa ttt gtg aac cca aaa gac cac aac aag ccg gtg cag gag gta 768
Thr Glu Phe Val Asn Pro Lys Asp His Asn Lys Pro Val Gln Glu Val
245 250 255

ctt gct gag atg acc aac gga ggg gtc gac cgc agc gtg gaa tgc act 816
Leu Ala Glu Met Thr Asn Gly Gly Val Asp Arg Ser Val Glu Cys Thr
260 265 270

ggc aac atc aat gct atg atc caa gct ttc gaa tgt gtt cat gat ggc 864
Gly Asn Ile Asn Ala Met Ile Gln Ala Phe Glu Cys Val His Asp Gly
275 280 285

tgg ggt gtt gcc gtg ctg gtg ggt gtg ccg cat aag gac gct gag ttc 912
Trp Gly Val Ala Val Leu Val Gly Val Pro His Lys Asp Ala Glu Phe
290 295 300

aag acc cac ccg atg aac ttc ctg aac gaa agg acc ctg aag ggg acc 960
Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr
305 310 315 320

ttc ttt ggc aac tat aag cca cgc act gat ctg cca aat gtg gtg gag 1008
Phe Phe Gly Asn Tyr Lys Pro Arg Thr Asp Leu Pro Asn Val Val Glu
325 330 335

ctg tac atg aaa aag gag ctg gag gtg gag aag ttc atc acg cac agc 1056
Leu Tyr Met Lys Lys Glu Leu Glu Val Glu Lys Phe Ile Thr His Ser
340 345 350

gtc ccg ttc gcg gag atc aac aag gcg ttc aac ctg atg gcc aag ggg 1104
Val Pro Phe Ala Glu Ile Asn Lys Ala Phe Asn Leu Met Ala Lys Gly
355 360 365

gag ggc atc cgc tgc atc atc cgc atg gag aac tag 1140
Glu Gly Ile Arg Cys Ile Ile Arg Met Glu Asn
370 375

<210> 48 

<211> 379 

<212> PRT 

<213> Zea mays 

<400> 48 

Met Ala Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Ala Trp
1 5 10 15

Glu Ala Gly Lys Pro Leu Ser Ile Glu Glu Val Glu Val Ala Pro Pro
20 25 30

Gln Ala Met Glu Val Arg Val Lys Ile Leu Phe Thr Ser Leu Cys His
35 40 45

Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gln Thr Pro Val Phe Pro
50 55 60

Arg Ile Phe Gly His Glu Ala Gly Gly Ile Ile Glu Ser Val Gly Glu
65 70 75 80

Gly Val Thr Asp Val Ala Pro Gly Asp His Val Leu Pro Val Phe Thr
85 90 95

Gly Glu Cys Lys Glu Cys Ala His Cys Lys Ser Ala Glu Ser Asn Met
100 105 110

Cys Asp Leu Leu Arg Ile Asn Thr Asp Arg Gly Val Met Ile Gly Asp
115 120 125

Gly Lys Ser Arg Phe Ser Ile Asn Gly Lys Pro Ile Tyr His Phe Val
130 135 140

Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Met His Val Gly Cys Val
145 150 155 160

Ala Lys Ile Asn Pro Gln Ala Pro Leu Asp Lys Val Cys Val Leu Ser
165 170 175

Cys Gly Ile Ser Thr Gly Leu Gly Ala Ser Ile Asn Val Ala Lys Pro
180 185 190

Pro Lys Gly Ser Thr Val Ala Val Phe Gly Leu Gly Ala Val Gly Leu
195 200 205

Ala Ala Ala Glu Gly Ala Arg Ile Ala Gly Ala Ser Arg Ile Ile Gly
210 215 220

Val Asp Leu Asn Pro Ser Arg Phe Glu Glu Ala Arg Lys Phe Gly Cys
225 230 235 240

Thr Glu Phe Val Asn Pro Lys Asp His Asn Lys Pro Val Gln Glu Val
245 250 255

Leu Ala Glu Met Thr Asn Gly Gly Val Asp Arg Ser Val Glu Cys Thr
260 265 270

Gly Asn Ile Asn Ala Met Ile Gln Ala Phe Glu Cys Val His Asp Gly
275 280 285

Trp Gly Val Ala Val Leu Val Gly Val Pro His Lys Asp Ala Glu Phe
290 295 300

Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr
305 310 315 320

Phe Phe Gly Asn Tyr Lys Pro Arg Thr Asp Leu Pro Asn Val Val Glu
325 330 335

Leu Tyr Met Lys Lys Glu Leu Glu Val Glu Lys Phe Ile Thr His Ser
340 345 350

Val Pro Phe Ala Glu Ile Asn Lys Ala Phe Asn Leu Met Ala Lys Gly
355 360 365

Glu Gly Ile Arg Cys Ile Ile Arg Met Glu Asn
370 375



<210> 49 
<211> 505 
<212> DNA 

<213> Artificial sequence 

<220>

<223> Description of the artificial sequence: coding for
sense RNA fragment of E. coli codA gene
<400> 49 

aagcttggct aacagtgtcg aataacgctt tacaaacaat tattaacgcc cggttaccag 60 
gcgaagaggg gctgtggcag attcatctgc aggacggaaa aatcagcgcc attgatgcgc 120 
aatccggcgt gatgcccata actgaaaaca gcctggatgc cgaacaaggt ttagttatac 180 
cgccgtttgt ggagccacat attcacctgg acaccacgca aaccgccgga caaccgaact 240 
ggaatcagtc cggcacgctg tttgaaggca ttgaacgctg ggccgagcgc aaagcgttat 300 
taacccatga cgatgtgaaa caacgcgcat ggcaaacgct gaaatggcag attgccaacg 360 



gcattcagca tgtgcgtacc catgtcgatg tttcggatgc aacgctaact gcgctgaaag 420
caatgctgga agtgaagcag gaagtcgcgc cgtggattga tctgcaaatc gtcgccttcc 480
ctcaggaagg gattttgtcg tcgac 505



<210> 50 
<211> 27 
<212> DNA

<213> Artificial sequence
<220>

<223> Description of the artificial sequence:
oligonucleotide primer

<400> 50
cgtgaatacg gcgtggagtc g 21

<210> 51 
<211> 26 
<212> DNA 

<213> Artificial sequence
<220>

<223> Description of the artificial sequence:
oligonucleotide primer

<400> 51

cggcaggata atcaggttgg 20

<210> 52
<211> 505
<212> DNA

<213> Artificial sequence

<220>

<223> Description of the artificial sequence: coding for
antisense RNA fragment of E. coli codA gene

<400> 52

gaattcggct aacagtgtcg aataacgctt tacaaacaat tattaacgcc cggttaccag 60
gcgaagaggg gctgtggcag attcatctgc aggacggaaa aatcagcgcc attgatgcgc 120
aatccggcgt gatgcccata actgaaaaca gcctggatgc cgaacaaggt ttagttatac 180
cgccgtttgt ggagccacat attcacctgg acaccacgca aaccgccgga caaccgaact 240
ggaatcagtc cggcacgctg tttgaaggca ttgaacgctg ggccgagcgc aaagcgttat 300
taacccatga cgatgtgaaa caacgcgcat ggcaaacgct gaaatggcag attgccaacg 360
gcattcagca tgtgcgtacc catgtcgatg tttcggatgc aacgctaact gcgctgaaag 420
caatgctgga agtgaagcag gaagtcgcgc cgtggattga tctgcaaatc gtcgccttcc 480
ctcaggaagg gattttgtcg gatcc 505

<210> 53 
<211> 27 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: 
oligonucleotide primer 

<400> 53 

gtcaacgtaa ccaaccctgc 20 

<210> 54 
<211> 26 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: 
oligonucleotide primer 

<400> 54 

ggatccgaca aaatcccttc ctgagg 26

<210> 55 
<211> 5674
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: vector 
construct pBluKS-nitF-STLS1-35S-T

<400> 55 

ccagcttttg ttccctttag tgagggttaa tttcgagctt ggcgtaatca tggtcatagc 60 
tgtttcctgt gtgaaattgt tatccgctca caattccaca caacatacga gccggaagca 120 
taaaatataa agcctggggt gcctaatgag tgagctaact cacattaatt gcgttgcgct 180
« c 3toe=a ggaaacctgt cgtgccagct gcattaatga atcggccaac 240 

gcgcggggag aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc 300
tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt 360
tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg 420
ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg 480
agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat 540
accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta 600
ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct 660
gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc 720
ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa 780
gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg 840

=35= S32S 253= S= =2= === o 

™= === =2= s== 2= ~™ i 

33= =S £aggga«: t tggtcatgag attatcaaaa 1140 
cSlgatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata ^atgagtaaa 1200 
=S=a cagttaccaa tgcttaatca gtgaggcacc tatctcagcg ^^ctgtctat 1260 
S£?£2o caL^gcc tgactccccg tcgtgtagat £2 
taccatctgg ccccagtgc* gcaatgatac cgcgagaccc acgctcaccg ^tccagatt 1380 
tatcagcalt aaaccagcca gccggaaggg ccgagcgcag «9*«£«£ JJJJ 
ccgcctccat ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta w« 

=2= 5= 5= 3= sssss 

S2= Sage SLt^j jjj-jjjgj S= 

=222 SSS^ 233= 2=22 2=2" tagtg^gc 1800 
ggclaccgag S2=«o ccggcgtcaa tacgggataa taccgcgcca catagcagaa 1860 
c=aaag? gctcatcatt ggaaaacgtfc etteggggeg aaaactctoa £££££ ^ 
cgctgttgag atccagttcg atgtaaccca ctogtgcacc caactgatct t^atett i960 
ttactttcal cagegtttet gggtgagcaa aaacaggaag ^aaaatgee ^* aaaaa ^ «« 
gaataagggc gaeaeggaaa tgttgaatac ^catactcUt cctttfctcaa tatt^j 
acatttatca gggttattgt otcatgagcg gatacatatt tgaatgtatt tagaaaaa 
SaaS« ggLccgcgc acatttcccc gaaaagtgcc -ctgacgcg ccctgtagcg 2220 

gC gca tt aag cgcggcggg* ==2 == g£ggc t ?tc 2340 

=3= =3 =22= Sgggttccg a^ag.gc, "~ggcacc 2400 
tcgaccccL aaaacttgat tagggtgatg gttcaegtag ^ggecateg ccctgataga 2460 

3522= =22= 253= 22SSS =553 32=: !S 
== 552= =2=2= 223= 22=3= =2= 
=X3 525= 2= 525= 5=2 =1 

£2=2= gactcactal agggegaatt ggagctcgtc <MM 2880 

223= = =2= 22225 === 1 3000 
=3= 222= 2= == 2= 22222 3120 

2=2 -aat?c 9 t 9 gatg^gj taccctacat tctacaacca 31.0 

=252 £=2t =2= 2= =S =5= SS 
Si EHi f™ 2== 225= 2:22:3 = 



CA 02493364 2005-01-21 



PF 53790 




cataaaaaga ctatttcgtt tcattgacaa tttgtgttta tttgtaaaga aaagtggcaa 3540 

agtggaattt gagttcctgc aagtaagaaa gatgaaataa aagacttgag tgtgtgtttt 3600 

tttcttttat ctgaaagctg caatgaaata ttcctaccaa gcccgtttga ttattaattg 3660 

gggtttggtt ttcttgatgc gaactaattg gttatataag aaactataca atccatgtta 3720 

attcaaaaat tttgatttct cttgtaggaa tatgatttac tatatgagac tttcttttcg 3780

ccaataatag taaatccaaa gatatttgac cggaccaaaa cacattgatc tattttttag 3840 

tttatttaat ccagtttctc tgagataatt cattaaggaa aacttagtat taacccatcc 3900 

taagattaaa taggagccaa actcacattt caaatattaa ataacataaa atggatttaa 3960 

aaaatctata cgtcaaattt tatttatgac atttcttatt taaatttata tttaatgaaa 4020 

tacagctaag acaaaccaaa aaaaaaatac tttctaagtg gtccaaaaca tcaattccgt 4080 

tcaatattat taggtagaat cgtacgacca aaaaaaggta ggttaatacg aattagaaac 4140

atatctataa catagtatat attattacct attatgagga atcaaaatgc atcaaatatg 4200 

gatttaagga atccataaaa gaataaattc tacgggaaaa aaaatggaat aaattctttt 4260 

aagtttttta tttgtttttt atttggtagt tctccatttt gttttatttc gtttggattt 4320 

attgtgtcca aatactttgt aaaccaccgt tgtaattctt aaacggggtt ttcacttctt 4380 

ttttatattc agacataaag catcggctgg tttaatcaat caatagattt tatttttctt 4440

ctcaattatt agtaggtttg atgtgaactt tacaaaaaaa acaaaaacaa atcaatgcag 4500 

agaaaagaaa ccacgtgggc tagtcccacc ttgtttcatt tccaccacag gttcgatctt 4560

cgttaccgtc tccaatagga aaataaacgt gaccacaaaa aaaaaacaaa aaaaagtcta 4620 

tatattgctt ctctcaagtc tctgagtgtc atgaaccaaa gtaaaaaaca aagactcgac 4680 

ctgcaggcat gcaagcttat cgtcgactac gtaagtttct gcttctacct ttgatatata 4740

tataataatt atcattaatt agtagtaata taatatttca aatatttttt tcaaaataaa 4800 

agaatgtagt atatagcaat tgcttttctg tagtttataa gtgtgtatat tttaatttat 4860

aacttttcta atatatgacc aaaatttgtt gatgtgcagg tatcaccgga tccatcgaat 4920

tcggtacgct gaaatcacca gtctctctct acaaatctat ctctctctat tttctccata 4980 

aataatgtgt gagtagtttc ccgataaggg gaanttaggg ttcttatagg gtttcgctca 5040 

tgtgttgagc atataagaaa cccttagtat gtatttgtat ttgtaaaata cttctatcaa 5100 

taaaatttct aattcctaaa accaaaatcc agtactaaaa tccagatctc ctaaagtccc 5160 

tatagatctt tgtcgtgaat ataaaccaga cacgagacga ctaaacctgg agcccagacg 5220 

ccgttcgaag ctagaagtac cgcttaggca ggaggccgtt agggaaaaga tgctaaggca 5280 

gggttggtta cgttgactcc cccgtaggtt tggtttaaat atgatgaagt ggacggaagg 5340 

aaggaggaag acaaggaagg ataaggttgc aggccctgtg caaggtaaga agatggaaat 5400 

ttgatagagg tacgctacta tacttatact atacgctaag ggaatgcttg tatttatacc 5460

ctataccccc taataacccc ttatcaattt aagaaataat ccgcataagc ccccgcttaa 5520 

aaattggtat cagagccatg aataggtcta tgaccaaaac tcaagaggat aaaacctcac 5580 

caaaatacga aagagttctt aactctaaag ataaaagatc tttcaagatc aaaactagtt 5640 

ccctcacacc ggtgacgggg atcgcgatgg gtac 5674 

<210> 56 
<211> 6046 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: binary 
vector pSUN1

<400> 56 

ttccatggac atacaaatgg acgaacggat aaaccttttc acgccctttt aaatatccga 60 

ttattctaat aaacgctctt ttctcttagg tttacccgcc aatatatcct gtcaaacact 120 

gatagtttaa actgaaggcg ggaaacgaca atcagatcta gtaggaaaca gctatgacca 180 

tgattacgcc aagcttgcat gcctgcaggt cgactctaga ctagtggatc cgatatcgcc 240 

cgggctcgag gtaccgagct cgaattcact ggccgtcgtt ttacaacgac tcagctgctt 300 

ggtaataatt gtcattagat tgtttttatg catagatgca ctcgaaatca gccaatttta 360 

gacaagtatc aaacggatgt taattcagta cattaaagac gtccgcaatg tgttattaag 420 

ttgtctaagc gtcaatttgt ttacaccaca atatatcctg ccaccagcca gccaacagct 480 

ccccgaccgg cagctcggca caaaatcacc acgcgttacc accacgccgg ccggccgcat 540 

ggtgttgacc gtgttcgccg gcattgccga gttcgagcgt tccctaatca tcgaccgcac 600 






=32 2122 1222 1222 ~ = EHSH i 

™2 5223 '-3223 3332? =2 2 
H2°c2 ^-w. J5S33 '322= 332S2 2 
c " cc "l'S i:«r«=S £223 32°?"I 1223* 22~« »» 
!2|~!2 ;2321 12=23 ,==<*„===,, ...,,=«=,== 

™£ £223 2122 2.22 ,^232 3233? 2 
£212 222, jjjgjgj j«™ -23* 9 . 
S5SS 2332 2323 2 23 2 j-S- - 
T222 323 2 22 22|1. 2 = - 

i.22 3223 3322 232 2332 322111 »H 
™ 2= I~ S~ =2 2212 12 

cacgctaagt gccggccgtc cgagcgcacg «9cagcaag ^9«acgt tgg j ^ 
qgcagacacg ccagccatga agcgggtcaa ctttcagttg co 99C9gagg ptoa , tacat 2 040 

322 S 2 =2 5 =I 2*22 %2 12 
~I ==£ 21-23 13332 22S 2233 S 

acaaatcggc gcggcgctgg 9^atgacct ggtgg g 9 * C g gC cgctga 2400 

SEES 1221H S=S= 22, M~ *™~ 
-.221 '.=3 22 2 g r4 .22 ™= 

tggcgaggtg atccgctacg agcttccaga ^99" g g atctaaocga 2700 

cggcatggcc agtgtgtggg attacgacct 9gtactgatg gcgg~ aca 2760 

5SS5= 5« = ESS S=SS =S5S ;s°. 

53=S «S JJgtjj-. J-J-J 

cgtaaagagc gaaaccgggc ggccggagta <=atcgagatc 9agctagctg attgg g 
ccgcgagatc acagaaggca agaacccgga cg^gctgacg 9ttcaccccg ^ 312Q 

gatcgatocc 99catcggcc 9«ttctcta ~ a « g ^ g^'dccg gagagttcaa 3180 

JS 2233 122- c S L2 ,,™ 

SS552 ?12?2 22 222 ig- -™ - 

aggggaaaaa ggtcgaaaag Stctctttcc tgcgg g ccgtaca ttgggaaccg 3480 

gccgtacatt gggaaccgga aoccgtacat ^ggaaccca * ag ==* cctaaaactc 3540 

S=S 222? 32:25 12 222 =™, 
~»22 1221" 3232 222 2 2.= »» 

cLtctacca gggcgcggac aagccgcgcc 9tcgccact= 9accgccggc gccca ^ 
aggcaccctg cctcgcgcgt "cggtg.** ^acaagcc cgLagggcg 3900 

SKSSJ SSS SK35S S-SJ ccag.cacg. agcga.agcg 3960 
gagtgtatac tggcttaact atgcggcatc agagcagatt gtactgagag tgcaccatat 4020
gcggtgtgaa ataccgcaca gatgcgtaag gagaaaatac cgcatcaggc gctcttccgc 4080 
ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca 4140 
ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg 4200
agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca 4260 
taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa 4320 
cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc 4380 
tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc 4440 
gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct 4500 
gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg 4560 
tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag 4620 
gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta 4680 
cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg 4740
aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 4800
tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt 4860 
ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgca 4920
tgatatatct cccaatttgt gtagggctta ttatgcacgc ttaaaaataa taaaagcaga 4980 
cttgacctga tagtttggct gtgagcaatt atgtgcttag tgcatctaac gcttgagtta 5040 
agccgcgccg cgaagcggcg tcggcttgaa cgaatttcta gctagacatt atttgccgac 5100 
taccttggtg atctcgcctt tcacgtagtg gacaaattct tccaactgat ctgcgcgcga 5160 
ggccaagcga tcttcttctt gtccaagata agcctgtcta gcttcaagta tgacgggctg 5220 
atactgggcc ggcaggcgct ccattgccca gtcggcagcg acatccttcg gcgcgatttt 5280 
gccggttact gcgctgtacc aaatgcggga caacgtaagc actacatttc gctcatcgcc 5340 
agcccagtcg ggcggcgagt tccatagcgt taaggtttca tttagcgcct caaatagatc 5400 
ctgttcagga accggatcaa agagttcctc cgccgctgga cctaccaagg caacgctatg 5460 
ttctcttgct tttgtcagca agatagccag atcaatgtcg atcgtggctg gctcgaagat 5520 
acctgcaaga atgtcattgc gctgccattc tccaaattgc agttcgcgct tagctggata 5580 
acgccacgga atgatgtcgt cgtgcacaac aatggtgact tctacagcgc ggagaatctc 5640 
gctctctcca ggggaagccg aagtttccaa aaggtcgttg atcaaagctc gccgcgttgt 5700
ttcatcaagc cttacggtca ccgtaaccag caaatcaata tcactgtgtg gcttcaggcc 5760 
gccatccact gcggagccgt acaaatgtac ggccagcaac gtcggttcga gatggcgctc 5820 
gatgacgcca actacctctg atagttgagt cgatacttcg gcgatcaccg cttcccccat 5880 
gatgtttaac tttgttttag ggcgactgcc ctgctgcgta acatcgttgc tgctccataa 5940 
catcaaacat cgacccacgg cgtaacgcgc ttgctgcttg gatgcccgag gcatagactg 6000 
taccccaaaa aaacagtcat aacaagccat gaaaaccgcc actgcg 6046

<210> 57 
<211> 9838 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: Transgenic 
expression vector for codA dsRNA pSUN1-codA-RNAi

<400> 57 

cgaattcact ggccgtcgtt ttacaacgac tcagctgctt ggtaataatt gtcattagat 60 

tgtttttatg catagatgca ctcgaaatca gccaatttta gacaagtatc aaacggatgt 120 

taattcagta cattaaagac gtccgcaatg tgttattaag ttgtctaagc gtcaatttgt 180 

ttacaccaca atatatcctg ccaccagcca gccaacagct ccccgaccgg cagctcggca 240

caaaatcacc acgcgttacc accacgccgg ccggccgcat ggtgttgacc gtgttcgccg 300 

gcattgccga gttcgagcgt tccctaatca tcgaccgcac ccggagcggg cgcgaggccg 360 

ccaaggcccg aggcgtgaag tttggccccc gccctaccct caccccggca cagatcgcgc 420

acgcccgcga gctgatcgac caggaaggcc gcaccgtgaa agaggcggct gcactgcttg 480

gcgtgcatcg ctcgaccctg taccgcgcac ttgagcgcag cgaggaagtg acgcccaccg 540 

aggccaggcg gcgcggtgcc ttccgtgagg acgcattgac cgaggccgac gccctggcgg 600 

ccgccgagaa tgaacgccaa gaggaacaag catgaaaccg caccaggacg gccaggacga 660 

accgtttttc attaccgaag agatcgaggc ggagatgatc gcggccgggt acgtgttcga 720
gccgcccgcg cacgtctcaa ccgtgcggct gcatgaaatc ctggccggtt tgtctgatgc 780
caagctggcg gcctggccgg ccagcttggc cgctgaagaa accgagcgcc gccgtctaaa 840
aaggtgatgt gtatttgagt aaaacagctt gcgtcatgcg gtcgctgcgt atatgatgcg 900
atgagtaaat aaacaaatac gcaaggggaa cgcatgaagg ttatcgctgt acttaaccag 960
aaaggcgggt caggcaagac gaccatcgca acccatctag cccgcgccct gcaactcgcc 1020
ggggccgatg ttctgttagt cgattccgat ccccagggca gtgcccgcga ttgggcggcc 1080
gtgcgggaag atcaaccgct aaccgttgtc ggcatcgacc gcccgacgat tgaccgcgac 1140
gtgaaggcca tcggccggcg cgacttcgta gtgatcgacg gagcgcccca ggcggcggac 1200
ttggctgtgt ccgcgatcaa ggcagccgac ttcgtgctga ttccggtgca gccaagccct 1260
tacgacatat gggccaccgc cgacctggtg gagctggtta agcagcgcat tgaggtcacg 1320
gatggaaggc tacaagcggc ctttgtcgtg tcgcgggcga tcaaaggcac gcgcatcggc 1380
ggtgaggttg ccgaggcgct ggccgggtac gagctgccca ttcttgagtc ccgtatcacg 1440
cagcgcgtga gctacccagg cactgccgcc gccggcacaa ccgttcttga atcagaaccc 1500
gagggcgacg ctgcccgcga ggtccaggcg ctggccgctg aaattaaatc aaaactcatt 1560
tgagttaatg aggtaaagag aaaatgagca aaagcacaaa cacgctaagt gccggccgtc 1620
cgagcgcacg cagcagcaag gctgcaacgt tggccagcct ggcagacacg ccagccatga 1680
agcgggtcaa ctttcagttg ccggcggagg atcacaccaa gctgaagatg tacgcggtac 1740
gccaaggcaa gaccattacc gagctgctat ctgaatacat cgcgcagcta ccagagtaaa 1800
tgagcaaatg aataaatgag tagatgaatt ttagcggcta aaggaggcgg catggaaaat 1860
caagaacaac caggcaccga cgccgtggaa tgccccatgt gtggaggaac gggcggttgg 1920
ccaggcgtaa gcggctgggt tgtctgccgg ccctgcaatg gcactggaac ccccaagccc 1980
gaggaatcgg cgtgagcggt cgcaaaccat ccggcccggt acaaatcggc gcggcgctgg 2040
gtgatgacct ggtggagaag ttgaaggccg cgcaggccgc ccagcggcaa cgcatcgagg 2100
cagaagcacg ccccggtgaa tcgtggcaag cggccgctga tcgaatccgc aaagaatccc 2160
ggcaaccgcc ggcagccggt gcgccgtcga ttaggaagcc gcccaagggc gacgagcaac 2220
cagatttttt cgttccgatg ctctatgacg tgggcacccg cgatagtcgc agcatcatgg 2280
acgtggccgt tttccgtctg tcgaagcgtg accgacgagc tggcgaggtg atccgctacg 2340
agcttccaga cgggcacgta gaggtttccg cagggccggc cggcatggcc agtgtgtggg 2400
attacgacct ggtactgatg gcggtttccc atctaaccga atccatgaac cgataccggg 2460
aagggaaggg agacaagccc ggccgcgtgt tccgtccaca cgttgcggac gtactcaagt 2520
tctgccggcg agccgatggc ggaaagcaga aagacgacct ggtagaaacc tgcattcggt 2580
taaacaccac gcacgttgcc atgcagcgta cgaagaaggc caagaacggc cgcctggtga 2640
cggtatccga gggtgaagcc ttgattagcc gctacaagat cgtaaagagc gaaaccgggc 2700
ggccggagta catcgagatc gagctagctg attggatgta ccgcgagatc acagaaggca 2760
agaacccgga cgtgctgacg gttcaccccg attacttttt gatcgatccc ggcatcggcc 2820
gttttctcta ccgcctggca cgccgcgccg caggcaaggc agaagccaga tggttgttca 2880
agacgatcta cgaacgcagt ggcagcgccg gagagttcaa gaagttctgt ttcaccgtgc 2940
gcaagctgat cgggtcaaat gacctgccgg agtacgattt gaaggaggag gcggggcagg 3000
ctggcccgat cctagtcatg cgctaccgca acctgatcga gggcgaagca tccgccggtt 3060
cctaatgtac ggagcagatg ctagggcaaa ttgccctagc aggggaaaaa ggtcgaaaag 3120
gtctctttcc tgtggatagc acgtacattg ggaacccaaa gccgtacatt gggaaccgga 3180
acccgtacat tgggaaccca aagccgtaca ttgggaaccg gtcacacatg taagtgactg 3240
atataaaaga gaaaaaaggc gatttttccg cctaaaactc tttaaaactt attaaaactc 3300
ttaaaacccg cctggcctgt gcataactgt ctggccagcg cacagccgaa gagctgcaaa 3360
aagcgcctac ccttcggtcg ctgcgctccc tacgccccgc cgcttcgcgt cggcctatcg 3420
cggccgctgg ccgctcaaaa atggctggcc tacggccagg caatctacca gggcgcggac 3480
aagccgcgcc gtcgccactc gaccgccggc gcccacatca aggcaccctg cctcgcgcgt 3540
ttcaataatg acggtgaaaa cctctgacac atgcagctcc cggagacggt cacagcttgt 3600
ctgtaagcgg atgccgggag cagacaagcc cgtcagggcg cgtcagcggg tgttggcggg 3660
tgtcggggcg cagccatgac ccagtcacgt agcgatagcg gagtgtatac tggcttaact 3720 
atgcggcatc agagcagatt gtactgagag tgcaccatat gcggtgtgaa ataccgcaca 3780
gatgcgtaag gagaaaatac cgcatcaggc gctcttccgc ttcctcgctc actgactcgc 3840
tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt 3900
tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg 3960
ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg 4020
agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat 4080
accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta 4140
ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct 4200 
gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc 4260 
ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa 4320 
gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg 4380 
taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag 4440 
tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt 4500 
gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta 4560 
cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc 4620 
agtggaacga aaactcacgt taagggattt tggtcatgca tgatatatct cccaatttgt 4680 
gtagggctta ttatgcacgc ttaaaaataa taaaagcaga cttgacctga tagtttggct 4740 
gtgagcaatt atgtgcttag tgcatctaac gcttgagtta agccgcgccg cgaagcggcg 4800
tcggcttgaa cgaatttcta gctagacatt atttgccgac taccttggtg atctcgcctt 4860 
tcacgtagtg gacaaattct tccaactgat ctgcgcgcga ggccaagcga tcttcttctt 4920 
gtccaagata agcctgtcta gcttcaagta tgacgggctg atactgggcc ggcaggcgct 4980
ccattgccca gtcggcagcg acatccttcg gcgcgatttt gccggttact gcgctgtacc 5040 
aaatgcggga caacgtaagc actacatttc gctcatcgcc agcccagtcg ggcggcgagt 5100 
tccatagcgt taaggtttca tttagcgcct caaatagatc ctgttcagga accggatcaa 5160 
agagttcctc cgccgctgga cctaccaagg caacgctatg ttctcttgct tttgtcagca 5220 
agatagccag atcaatgtcg atcgtggctg gctcgaagat acctgcaaga atgtcattgc 5280 
gctgccattc tccaaattgc agttcgcgct tagctggata acgccacgga atgatgtcgt 5340
cgtgcacaac aatggtgact tctacagcgc ggagaatctc gctctctcca ggggaagccg 5400
aagtttccaa aaggtcgttg atcaaagctc gccgcgttgt ttcatcaagc cttacggtca 5460
ccgtaaccag caaatcaata tcactgtgtg gcttcaggcc gccatccact gcggagccgt 5520
acaaatgtac ggccagcaac gtcggttcga gatggcgctc gatgacgcca actacctctg 5580
atagttgagt cgatacttcg gcgatcaccg cttcccccat gatgtttaac tttgttttag 5640
ggcgactgcc ctgctgcgta acatcgttgc tgctccataa catcaaacat cgacccacgg 5700 
cgtaacgcgc ttgctgcttg gatgcccgag gcatagactg taccccaaaa aaacagtcat 5760 
aacaagccat gaaaaccgcc actgcgttcc atggacatac aaatggacga acggataaac 5820 
cttttcacgc ccttttaaat atccgattat tctaataaac gctcttttct cttaggttta 5880 
cccgccaata tatcctgtca aacactgata gtttaaactg aaggcgggaa acgacaatca 5940
gatctagtag gaaacagcta tgaccatgat tacgccaagc ttgcatgcct gcaggtcgac 6000 
tctagactag tggatccgat atcgcccggg ctcgaggtac ccatcgcgat ccccgtcacc 6060
ggtgtgaggg aactagtttt gatcttgaaa gatcttttat ctttagagtt aagaactctt 6120 
tcgtattttg gtgaggtttt atcctcttga gttttggtca tagacctatt catggctctg 6180 
ataccaattt ttaagcgggg gcttatgcgg attatttctt aaattgataa ggggttatta 6240
gggggtatag ggtataaata caagcattcc cttagcgtat agtataagta tagtagcgta 6300 
cctctatcaa atttccatct tcttaccttg cacagggcct gcaaccttat ccttccttgt 6360 
cttcctcctt ccttccgtcc acttcatcat atttaaacca aacctacggg ggagtcaacg 6420
taaccaaccc tgccttagca tcttttccct aacggcctcc tgcctaagcg gtacttctag 6480
cttcgaacgg cgtctgggct ccaggtttag tcgtctcgtg tctggtttat attcacgaca 6540 
aagatctata gggactttag gagatctgga ttttagtact ggattttggt tttaggaatt 6600 
agaaatttta ttgatagaag tattttacaa atacaaatac atactaaggg tttcttatat 6660 
gctcaacaca tgagcgaaac cctataagaa ccctaanttc cccttatcgg gaaactactc 6720 
acacattatt tatggagaaa atagagagag atagatttgt agagagagac tggtgatttc 6780 
agcgtaccga attcggctaa cagtgtcgaa taacgcttta caaacaatta ttaacgcccg 6840 
gttaccaggc gaagaggggc tgtggcagat tcatctgcag gacggaaaaa tcagcgccat 6900 
tgatgcgcaa tccggcgtga tgcccataac tgaaaacagc ctggatgccg aacaaggttt 6960 
agttataccg ccgtttgtgg agccacatat tcacctggac accacgcaaa ccgccggaca 7020 
accgaactgg aatcagtccg gcacgctgtt tgaaggcatt gaacgctggg ccgagcgcaa 7080
agcgttatta acccatgacg atgtgaaaca acgcgcatgg caaacgctga aatggcagat 7140
tgccaacggc attcagcatg tgcgtaccca tgtcgatgtt tcggatgcaa cgctaactgc 7200 
gctgaaagca atgctggaag tgaagcagga agtcgcgccg tggattgatc tgcaaatcgt 7260 
cgccttccct caggaaggga ttttgtcgga tccggtgata cctgcacatc aacaaatttt 7320
ggtcatatat tagaaaagtt ataaattaaa atatacacac ttataaacta cagaaaagca 7380
attgctatat actacattct tttattttga aaaaaatatt tgaaatatta tattactact 7440
aattaatgat aattattata tatatatcaa aggtagaagc agaaacttac gtagtcgacg 7500
acaaaatccc ttcctgaggg aaggcgacga tttgcagatc aatccacggc gcgacttcct 7560 
gcttcacttc cagcattgct ttcagcgcag ttagcgttgc atccgaaaca tcgacatggg 7620 
tacgcacatg ctgaatgccg ttggcaatct gccatttcag cgtttgccat gcgcgttgtt 7680 
tcacatcgtc atgggttaat aacgctttgc gctcggccca gcgttcaatg ccttcaaaca 7740 
gcgtgccgga ctgattccag ttcggttgtc cggcggtttg cgtggtgtcc aggtgaatat 7800
gtggctccac aaacggcggt ataactaaac cttgttcggc atccaggctg ttttcagtta 7860 
tgggcatcac gccggattgc gcatcaatgg cgctgatttt tccgtcctgc agatgaatct 7920 
gccacagccc ctcttcgcct ggtaaccggg cgttaataat tgtttgtaaa gcgttattcg 7980 
acactgttag ccaagcttgc atgcctgcag gtcgagtctt tgttttttac tttggttcat 8040
gacactcaga gacttgagag aagcaatata tagacttttt tttgtttttt ttttgtggtc 8100
acgtttattt tcctattgga gacggtaacg aagatcgaac ctgtggtgga aatgaaacaa 8160
ggtgggacta gcccacgtgg tttcttttct ctgcattgat ttgtttttgt tttttttgta 8220
aagttcacat caaacctact aataattgag aagaaaaata aaatctattg attgattaaa 8280
ccagccgatg ctttatgtct gaatataaaa aagaagtgaa aaccccgttt aagaattaca 8340
acggtggttt acaaagtatt tggacacaat aaatccaaac gaaataaaac aaaatggaga 8400 
actaccaaat aaaaaacaaa taaaaaactt aaaagaattt attccatttt ttttcccgta 8460 
gaatttattc ttttatggat tccttaaatc catatttgat gcattttgat tcctcataat 8520 
aggtaataat atatactatg ttatagatat gtttctaatt cgtattaacc tacctttttt 8580 
tggtcgtacg attctaccta ataatattga acggaattga tgttttggac cacttagaaa 8640 
gtattttttt tttggtttgt cttagctgta tttcattaaa tataaattta aataagaaat 8700 
gtcataaata aaatttgacg tatagatttt ttaaatccat tttatgttat ttaatatttg 8760 
aaatgtgagt ttggctccta tttaatctta ggatgggtta atactaagtt ttccttaatg 8820
aattatctca gagaaactgg attaaataaa ctaaaaaata gatcaatgtg ttttggtccg 8880 
gtcaaatatc tttggattta ctattattgg cgaaaagaaa gtctcatata gtaaatcata 8940
ttcctacaag agaaatcaaa atttttgaat taacatggat tgtatagttt cttatataac 9000
caattagttc gcatcaagaa aaccaaaccc caattaataa tcaaacgggc ttggtaggaa 9060 
tatttcattg cagctttcag ataaaagaaa aaaacacaca ctcaagtctt ttatttcatc 9120 
tttcttactt gcaggaactc aaattccact ttgccacttt tctttacaaa taaacacaaa 9180
ttgtcaatga aacgaaatag tctttttatg caaacactgt ttgtcttttt tcgatcacgt 9240 
ttctgattgt gacagccatc catatatata gggaatgtaa aacaacaaca tgtgaagtca 9300 
catatacgta atggtttagc atagcttcta ttttcgttgt caatattagt cattccaaaa 9360 
catttttaag aaaaataaat taatatatgt atattcttgg aactaatgta tgtggaaata 9420 
cagtaactta attattaaac attctaaatg caaatatgca aagaaaaaaa agaaaagaac 9480
acaactgaaa tcaaagccag attcataata attggctaca tggttgtaga atgtagggta 9540 
acacaacatc cagaattgaa cactcaaatt ggatgataga tggataatct ttagatacaa 9600 
gagaattggt tctcttccat tattaacgaa aataaagaaa aaaagtttag cataaaagtt 9660 
tgaaactcaa cataacattt tgaacttgac tccttcatag gagtgacatg aactgacgaa 9720 
tcacaaccga ttacttgttt gagtcatctt ccgctttctc caccttcgaa atgaatgtga 9780 
ccggtttctt cgggtgctca tttacggtca agtgtaaaac atctggtctc gacgagct 9838

<210> 58 
<211> 14184 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: Expression
vector pSUN1-codA-RNAi-At.Act.-2-At.Als-R-ocsT

<400> 58

ctgcttggta ataattgtca ttagattgtt tttatgcata gatgcactcg aaatcagcca 60 
attttagaca agtatcaaac ggatgttaat tcagtacatt aaagacgtcc gcaatgtgtt 120 
attaagttgt ctaagcgtca atttgtttac accacaatat atcctgccac cagccagcca 180 
acagctcccc gaccggcagc tcggcacaaa atcaccacgc gttaccacca cgccggccgg 240 
ccgcatggtg ttgaccgtgt tcgccggcat tgccgagttc gagcgttccc taatcatcga 300 
ccgcacccgg agcgggcgcg aggccgccaa ggcccgaggc gtgaagtttg gcccccgccc 360 
taccctcacc ccggcacaga tcgcgcacgc ccgcgagctg atcgaccagg aaggccgcac 420 
cgtgaaagag gcggctgcac tgcttggcgt gcatcgctcg accctgtacc gcgcacttga 480
gcgcagcgag gaagtgacgc ccaccgaggc caggcggcgc ggtgccttcc gtgaggacgc 540 
attgaccgag gccgacgccc tggcggccgc cgagaatgaa cgccaagagg aacaagcatg 600 
aaaccgcacc aggacggcca ggacgaaccg tttttcatta ccgaagagat cgaggcggag 660 
atgatcgcgg ccgggtacgt gttcgagccg cccgcgcacg tctcaaccgt gcggctgcat 720 
gaaatcctgg ccggtttgtc tgatgccaag ctggcggcct ggccggccag cttggccgct 780 
gaagaaaccg agcgccgccg tctaaaaagg tgatgtgtat ttgagtaaaa cagcttgcgt 840 
catgcggtcg ctgcgtatat gatgcgatga gtaaataaac aaatacgcaa ggggaacgca 900 
tgaaggttat cgctgtactt aaccagaaag gcgggtcagg caagacgacc atcgcaaccc 960 
atctagcccg cgccctgcaa ctcgccgggg ccgatgttct gttagtcgat tccgatcccc 1020 
agggcagtgc ccgcgattgg gcggccgtgc gggaagatca accgctaacc gttgtcggca 1080 
tcgaccgccc gacgattgac cgcgacgtga aggccatcgg ccggcgcgac ttcgtagtga 1140 
tcgacggagc gccccaggcg gcggacttgg ctgtgtccgc gatcaaggca gccgacttcg 1200
tgctgattcc ggtgcagcca agcccttacg acatatgggc caccgccgac ctggtggagc 1260 
tggttaagca gcgcattgag gtcacggatg gaaggctaca agcggccttt gtcgtgtcgc 1320
gggcgatcaa aggcacgcgc atcggcggtg aggttgccga ggcgctggcc gggtacgagc 1380 
tgcccattct tgagtcccgt atcacgcagc gcgtgagcta cccaggcact gccgccgccg 1440 
gcacaaccgt tcttgaatca gaacccgagg gcgacgctgc ccgcgaggtc caggcgctgg 1500 
ccgctgaaat taaatcaaaa ctcatttgag ttaatgaggt aaagagaaaa tgagcaaaag 1560 
cacaaacacg ctaagtgccg gccgtccgag cgcacgcagc agcaaggctg caacgttggc 1620 
cagcctggca gacacgccag ccatgaagcg ggtcaacttt cagttgccgg cggaggatca 1680 
caccaagctg aagatgtacg cggtacgcca aggcaagacc attaccgagc tgctatctga 1740 
atacatcgcg cagctaccag agtaaatgag caaatgaata aatgagtaga tgaattttag 1800 
cggctaaagg aggcggcatg gaaaatcaag aacaaccagg caccgacgcc gtggaatgcc 1860 
ccatgtgtgg aggaacgggc ggttggccag gcgtaagcgg ctgggttgtc tgccggccct 1920 
gcaatggcac tggaaccccc aagcccgagg aatcggcgtg agcggtcgca aaccatccgg 1980 
cccggtacaa atcggcgcgg cgctgggtga tgacctggtg gagaagttga aggccgcgca 2040 
ggccgcccag cggcaacgca tcgaggcaga agcacgcccc ggtgaatcgt ggcaagcggc 2100 
cgctgatcga atccgcaaag aatcccggca accgccggca gccggtgcgc cgtcgattag 2160
gaagccgccc aagggcgacg agcaaccaga ttttttcgtt ccgatgctct atgacgtggg 2220
cacccgcgat agtcgcagca tcatggacgt ggccgttttc cgtctgtcga agcgtgaccg 2280 
acgagctggc gaggtgatcc gctacgagct tccagacggg cacgtagagg tttccgcagg 2340 
gccggccggc atggccagtg tgtgggatta cgacctggta ctgatggcgg tttcccatct 2400
aaccgaatcc atgaaccgat accgggaagg gaagggagac aagcccggcc gcgtgttccg 2460
tccacacgtt gcggacgtac tcaagttctg ccggcgagcc gatggcggaa agcagaaaga 2520 
cgacctggta gaaacctgca ttcggttaaa caccacgcac gttgccatgc agcgtacgaa 2580 
gaaggccaag aacggccgcc tggtgacggt atccgagggt gaagccttga ttagccgcta 2640 
caagatcgta aagagcgaaa ccgggcggcc ggagtacatc gagatcgagc tagctgattg 2700 
gatgtaccgc gagatcacag aaggcaagaa cccggacgtg ctgacggttc accccgatta 2760
ctttttgatc gatcccggca tcggccgttt tctctaccgc ctggcacgcc gcgccgcagg 2820 
caaggcagaa gccagatggt tgttcaagac gatctacgaa cgcagtggca gcgccggaga 2880
gttcaagaag ttctgtttca ccgtgcgcaa gctgatcggg tcaaatgacc tgccggagta 2940 
cgatttgaag gaggaggcgg ggcaggctgg cccgatccta gtcatgcgct accgcaacct 3000 
gatcgagggc gaagcatccg ccggttccta atgtacggag cagatgctag ggcaaattgc 3060
cctagcaggg gaaaaaggtc gaaaaggtct ctttcctgtg gatagcacgt acattgggaa 3120 
cccaaagccg tacattggga accggaaccc gtacattggg aacccaaagc cgtacattgg 3180 
gaaccggtca cacatgtaag tgactgatat aaaagagaaa aaaggcgatt tttccgccta 3240 
aaactcttta aaacttatta aaactcttaa aacccgcctg gcctgtgcat aactgtctgg 3300 
ccagcgcaca gccgaagagc tgcaaaaagc gcctaccctt cggtcgctgc gctccctacg 3360 
ccccgccgct tcgcgtcggc ctatcgcggc cgctggccgc tcaaaaatgg ctggcctacg 3420
gccaggcaat ctaccagggc gcggacaagc cgcgccgtcg ccactcgacc gccggcgccc 3480
acatcaaggc accctgcctc gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc 3540
agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga caagcccgtc 3600 
agggcgcgtc agcgggtgtt ggcgggtgtc ggggcgcagc catgacccag tcacgtagcg 3660 
atagcggagt gtatactggc ttaactatgc ggcatcagag cagattgtac tgagagtgca 3720 
ccatatgcgg tgtgaaatac cgcacagatg cgtaaggaga aaataccgca tcaggcgctc 3780 
ttccgcttcc tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc 3840
agctcactca aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa 3900 
catgtgagca aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt 3960 
tttccatagg ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg 4020 
gcgaaacccg acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg 4080 
ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag 4140
cgtggcgctt tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc 4200 
caagctgggc tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa 4260
ctatcgtctt gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg 4320 
taacaggatt agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc 4380 
taactacggc tacactagaa ggacagtatt tggtatctgc gctctgctga agccagttac 4440 
cttcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg 4500 
tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt 4560 
gatcttttct acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt 4620
catgcatgat atatctccca atttgtgtag ggcttattat gcacgcttaa aaataataaa 4680 
agcagacttg acctgatagt ttggctgtga gcaattatgt gcttagtgca tctaacgctt 4740 
gagttaagcc gcgccgcgaa gcggcgtcgg cttgaacgaa tttctagcta gacattattt 4800 
gccgactacc ttggtgatct cgcctttcac gtagtggaca aattcttcca actgatctgc 4860
gcgcgaggcc aagcgatctt cttcttgtcc aagataagcc tgtctagctt caagtatgac 4920
gggctgatac tgggccggca ggcgctccat tgcccagtcg gcagcgacat ccttcggcgc 4980 
gattttgccg gttactgcgc tgtaccaaat gcgggacaac gtaagcacta catttcgctc 5040 
atcgccagcc cagtcgggcg gcgagttcca tagcgttaag gtttcattta gcgcctcaaa 5100 
tagatcctgt tcaggaaccg gatcaaagag ttcctccgcc gctggaccta ccaaggcaac 5160 
gctatgttct cttgcttttg tcagcaagat agccagatca atgtcgatcg tggctggctc 5220
gaagatacct gcaagaatgt cattgcgctg ccattctcca aattgcagtt cgcgcttagc 5280 
tggataacgc cacggaatga tgtcgtcgtg cacaacaatg gtgacttcta cagcgcggag 5340
aatctcgctc tctccagggg aagccgaagt ttccaaaagg tcgttgatca aagctcgccg 5400
cgttgtttca tcaagcctta cggtcaccgt aaccagcaaa tcaatatcac tgtgtggctt 5460 
caggccgcca tccactgcgg agccgtacaa atgtacggcc agcaacgtcg gttcgagatg 5520 
gcgctcgatg acgccaacta cctctgatag ttgagtcgat acttcggcga tcaccgcttc 5580
ccccatgatg tttaactttg ttttagggcg actgccctgc tgcgtaacat cgttgctgct 5640
ccataacatc aaacatcgac ccacggcgta acgcgcttgc tgcttggatg cccgaggcat 5700 
agactgtacc ccaaaaaaac agtcataaca agccatgaaa accgccactg cgttccatgg 5760
acatacaaat ggacgaacgg ataaaccttt tcacgccctt ttaaatatcc gattattcta 5820 
ataaacgctc ttttctctta ggtttacccg ccaatatatc ctgtcaaaca ctgatagttt 5880 
aaactgaagg cgggaaacga caatcagatc tagtaggaaa cagctatgac catgattacg 5940
ccaagcttgc atgcctgcag gtcgactcta gactagtgga tccgatatcg cccgggctcg 6000 
aggtacccat cgcgatcccc gtcaccggtg tgagggaact agttttgatc ttgaaagatc 6060 
ttttatcttt agagttaaga actctttcgt attttggtga ggttttatcc tcttgagttt 6120 
tggtcataga cctattcatg gctctgatac caatttttaa gcgggggctt atgcggatta 6180 
tttcttaaat tgataagggg ttattagggg gtatagggta taaatacaag cattccctta 6240 
gcgtatagta taagtatagt agcgtacctc tatcaaattt ccatcttctt accttgcaca 6300 
gggcctgcaa ccttatcctt ccttgtcttc ctccttcctt ccgtccactt catcatattt 6360
aaaccaaacc tacgggggag tcaacgtaac caaccctgcc ttagcatctt ttccctaacg 6420
gcctcctgcc taagcggtac ttctagcttc gaacggcgtc tgggctccag gtttagtcgt 6480 
ctcgtgtctg gtttatattc acgacaaaga tctataggga ctttaggaga tctggatttt 6540
agtactggat tttggtttta ggaattagaa attttattga tagaagtatt ttacaaatac 6600 
aaatacatac taagggtttc ttatatgctc aacacatgag cgaaacccta taagaaccct 6660 
aatttccctt atcgggaaac tactcacaca ttatttatgg agaaaataga gagagataga 6720 
tttgtagaga gagactggtg atttcagcgt accgaattcg attttcggct aacagtgtcg 6780
aataacgctt tacaaacaat tattaacgcc cggttaccag gcgaagaggg gctgtggcag 6840 
attcatctgc aggacggaaa aatcagcgcc attgatgcgc aatccggcgt gatgcccata 6900
actgaaaaca gcctggatgc cgaacaaggt ttagttatac cgccgtttgt ggagccacat 6960
attcacctgg acaccacgca aaccgccgga caaccgaact ggaatcagtc cggcacgctg 7020 
tttgaaggca ttgaacgctg ggccgagcgc aaagcgttat taacccatga cgatgtgaaa 7080
caacgcgcat ggcaaacgct gaaatggcag attgccaacg gcattcagca tgtgcgtacc 7140 
catgtcgatg tttcggatgc aacgctaact gcgctgaaag caatgctgga agtgaagcag 7200
gaagtcgcgc cgtggattga tctgcaaatc gtcgccttcc ctcaggaagg gattttgtcg 7260 
gatccggtga tacctgcaca tcaacaaatt ttggtcatat attagaaaag ttataaatta 7320 
aaatatacac acttataaac tacagaaaag caattgctat atactacatt cttttatttt 7380 
gaaaaaaata tttgaaatat tatattacta ctaattaatg ataattatta tatatatatc 7440 
aaaggtagaa gcagaaactt acgtagtcga cgacaaaatc ccgtcctgag ggaaggcgac 7500 
gatttgcaga tcaatccacg gcgcgacttc ctgcttcact tccagcattg ctttcagcgc 7560 
agttagcgtt gcatccgaaa catcgacatg ggtacgcaca tgctgaatgc cgttggcaat 7620 
ctgccatttc agcgtttgcc atgcgcgttg tttcacatcg tcatgggtta ataacgcttt 7680 
gcgctcggcc cagcgttcaa tgccttcaaa cagcgtgccg gactgattcc agttcggttg 7740
tccggcggtt tgcgtggtgt ccaggtgaat atgtggctcc acaaacggcg gtataactaa 7800 
accttgttcg gcatccaggc tgttttcagt tatgggcatc acgccggatt gcgcatcaat 7860 
ggcgctgatt tttccgtcct gcagatgaat ctgccacagc ccctcttcgc ctggtaaccg 7920 
ggcgttaata attgtttgta aagcgttatt cgacactgtt agccaagctt gcatgcctgc 7980
aggtcgactc tagaggatcc ccgatccact cgagtctttg ttttttactt tggttcatga 8040 
cactcagaga cttgagagaa gcaatatata gacttttttt tgtttttttt ttgtggtcac 8100
gtttattttc ctattggaga cggtaacgaa gatcgaacct gtggtggaaa tgaaacmagg 8160 
tgggactagc ccacgtggtt tcttttctct gcattgattt gtttttgttt tttytgtaaa 8220 
gttcacatca aacctactaa taattgagaa gaaaaataaa atctattgat tgattaaacc 8280 
agccgatgct ttatgtctga atataaaaaa gaagtgaaaa ccccgtttaa gaattacaac 8340 
ggtggtttac aaagtatttg gacacaataa atccaaacga aataaaacaa aatggagaac 8400
taccaaataa aaaacaaata aaaaacttaa aagaatttat tccatttttt ttcccgtaga 8460
atttattctt ttatggattc cttaaatcca tatttgatgc attttgattc ctcataatag 8520 
gtaataatat atactatgtt atagatatgt ttctaattcg tattaaccta cctttttttg 8580
gtcgtacgat tctacctaat aatattgaac ggaattgatg ttttggacca cttagaaagt 8640 
attttttttt tggtttgtct tagctgtatt tcattaaata taaatttaaa taagaaatgt 8700
cataaataaa atttgacgta tagatttttt aaatccattt tatgttattt aatatttgaa 8760 
atgtgagttt ggctcctatt taatcttagg atgggttaat actaagtttt ccttaatgaa 8820
ttatctcaga gaaactggat taaataaact aaaaaataga tcaatgtgtt ttggtccggt 8880
caaatatctt tggatttact attattggcg aaaagaaagt ctcatatagt aaatcatatt 8940
cctacaagag aaatcaaaat ttttgaatta acatggattg tatagtttct tatataacca 9000 
attagttcgc atcaagaaaa ccaaacccca attaataatc aaacgggctt ggtaggaata 9060 
tttcattgca gctttcagat aaaagaaaaa aacacacact caagtctttt atttcatctt 9120 
tcttacttgc aggaactcaa attccacttt gccacttttc tttacaaata aacacaaatt 9180 
gtcaatgaaa cgaaatagtc tttttatgca aacactgttt gtcttttttc gatcacgttt 9240 
ctgattgtga cagccatcca tatatatagg gaatgtaaaa caacaacatg tgaagtcaca 9300
tatacgtaat ggtttagcat agcttctatt ttcgttgtca atattagtca ttccaaaaca 9360
tttttaagaa aaataaatta atatatgtat attcttggaa ctaatgtatg tggaaataca 9420
gtaacttaat tattaaacat tctaaatgca aatatgcaaa gaaaaaaaag aaaagaacac 9480
aactgaaatc aaagccagat tcataataat tggctacatg gttgtagaat gtagggtaac 9540
acaacatcca gaattgaaca ctcaaattgg atgatagatg gataatcttt agatacaaga 9600
gaattggttc tcttccatta ttaacgaaaa taaagaaaaa aagtttagca taaaagtttg 9660
aaactcaaca taacattttg aacttgactc cttcatagga gtgacatgaa ctgacgaatc 9720 
acaaccgatt acttgtttga gtcatcttcc gctttctcca ccttcgaaat gaatgtgacc 9780 
ggtttcttcg ggtgctcatt tacggtcaag tgtaaaacat ctggtxrtcga gtaatgtcca 9840
accgaatcga agtacaactt agctcttgct acatcaccaa gatcttgatg ggggatcggg 9900 
taccgagctc gaattcactg gccgtcgttt tacaacgact cagcacgcgt tggtttcgac 9960 
aaaatttaga acgaacttaa ttatgatctc aaatacattg atacatatct catctagatc 10020 
taggttatca ttatgtaaga aagttttgac gaatatggca cgacaaaatg gctagactcg 10080 
atgtaattgg tatctcaact caacattata cttataccaa acattagtta gacaaaattt 10140
aaacaactat tttttatgta tgcaagagtc agcatatgta taattgattc agaatcgttt 10200 
tgacgagttc ggatgtagta gtagccatta tttaatgtac atactaatcg tgaatagtga 10260 
atatgatgaa acattgtatc ttattgtata aatatccata aacacatcat gaaagacact 10320 
ttctttcacg gtctgaatta attatgatac aattctaata gaaaacgaat taaattacgt 10380 
tgaattgtat gaaatctaat tgaacaagcc aaccacgacg acgactaacg ttgcctggat 10440 
tgactcggtt taagttaacc actaaaaaaa cggagctgtc atgtaacacg cggatcgagc 10500 
aggtcacagt catgaagcca tcaaagcaaa agaactaatc caagggctga gatgattaat 10560
tagtttaaaa attagttaac acgagggaaa aggctgtctg acagccaggt cacgttatct 10620 
ttacctgtgg tcgaaatgat tcgtgtctgt cgattttaat tatttttttg aaaggccgaa 10680 
aataaagttg taagagataa acccgcctat ataaattcat atattttcct ctccgctttg 10740 
aattgtctcg ttgtcctcct cactttcatc agccgttttg aatctccggc gacttgacag 10800 
agaagaacaa ggaagaagac taagagagaa agtaagagat aatccaggag attcattctc 10860 
cgttttgaat cttcctcaat ctcatcttct tccgctcttt ctttccaagg taataggaac 10920
tttctggatc tactttattt gctggatctc gatcttgttt tctcaatttc cttgagatct 10980
ggaattcgtt taatttggat ctgtgaacct ccactaaatc ttttggtttt actagaatcg 11040
atctaagttg accgatcagt tagctcgatt atagctacca gaatttggct tgaccttgat 11100 
ggagagatcc atgttcatgt tacctgggaa atgatttgta tatgtgaatt gaaatctgaa 11160 
ctgttgaagt tagattgaat ctgaacactg tcaatgttag attgaatctg aacactgttt 11220
aaggttagat gaagtttgtg tctagattct tcgaaacttt aggatttgta gtgtcgtacg 11280
ttgaacagaa agctatttct gattcaatca gggtttattt gactgtattg aactcttttt 11340 
gtgtgtttgc agctcataaa aaaaacgcga acctgcaggc atggcggcgg caacaacaac 11400
aacaacaaca tcttcttcga tctccttctc caccaaacca tctccttcct cctccaaatc 11460
accattacca atctccagat tctccctccc attctcccta aaccccaaca aatcatcctc 11520 
ctcctcccgc cgccgcggta tcaaatccag ctctccctcc tccatctccg ccgtgctcaa 11580 
cacaaccacc aatgtcacaa ccactccctc tccaaccaaa cctaccaaac ccgaaacatt 11640 
catctcccga ttcgctccag atcaaccccg caaaggcgct gatatcctcg tcgaagcttt 11700
agaacgtcaa ggcgtagaaa ccgtattcgc ttaccctgga ggtgcatcaa tggagattca 11760 
ccaagcctta acccgctctt cctcaatccg taacgtcctt cctcgtcacg aacaaggagg 11820 
tgtattcgca gcagaaggat acgctcgatc ctcaggtaaa ccaggtatct gtatagccac 11880 
ttcaggtccc ggagctacaa atctcgttag cggattagcc gatgcgttgt tagatagtgt 11940
tcctcttgta gcaatcacag gacaagtccc tcgtcgtatg attggtacag atgcgtttca 12000 
agagactccg attgttgagg taacgcgttc gattacgaag cataactatc ttgtgatgga 12060 
tgttgaagat atccctagga ttattgagga agctttcttt ttagctactt ctggtagacc 12120 
tggacctgtt ttggttgatg ttcctaaaga tattcaacaa cagcttgcga ttcctaattg 12180
ggaacaggct atgagattac ctggttatat gtctaggatg cctaaacctc cggaagattc 12240
tcatttggag cagattgtta ggttgatttc tgagtctaag aagcctgtgt tgtatgttgg 12300 
tggtggttgt ttgaattcta gcgatgaatt gggtaggttt gttgagctta cggggatccc 12360 
tgttgcgagt acgttgatgg ggctgggatc ttatccttgt gatgatgagt tgtcgttaca 12420
tatgcttgga atgcatggga ctgtgtatgc aaattacgct gtggagcata gtgatttgtt 12480 
gttggcgttt ggggtaaggt ttgatgatcg tgtcacgggt aagcttgagg cttttgctag 12540 
tagggctaag attgttcata ttgatattga ctcggctgag attgggaaga ataagactcc 12600 
tcatgtgtct gtgtgtggtg atgttaagct ggctttgcaa gggatgaata aggttcttga 12660 
gaaccgagcg gaggagctta agcttgattt tggagtttgg aggaatgagt tgaacgtaca 12720
gaaacagaag tttccgttga gctttaagac gtttggggaa gctattcctc cacagtatgc 12780 
gattaaggtc cttgatgagt tgactgatgg aaaagccata ataagtactg gtgtcgggca 12840
acatcaaatg tgggcggcgc agttctacaa ttacaagaaa ccaaggcagt ggctatcatc 12900
aggaggcctt ggagctatgg gatttggact tcctgctgcg attggagcgt ctgttgctaa 12960 
ccctgatgcg atagttgtgg atattgacgg agatggaagc tttataatga atgtgcaaga 13020 
gctagccact attcgtgtag agaatcttcc agtgaaggta cttttattaa acaaccagca 13080 
tcttggcatg gttatgcaat gggaagatcg gttctacaaa gctaaccgag ctcacacatt 13140 
tctcggggat ccggctcagg aggacgagat attcccgaac atgttgctgt ttgcagcagc 13200
ttgcgggatt ccagcggcga gggtgacaaa gaaagcagat ctccgagaag ctattcagac 13260
aatgctggat acaccaggac cttacctgtt ggatgtgatt tgtccgcacc aagaacatgt 13320 
gttgccgatg atcccgaatg gtggcacttt caacgatgtc ataacggaag gagatggccg 13380
gattaaatac tgagagatga aaccggcctg gccggcccgg agtggggagg cacgatggcc 13440 
gctttggtcg atcgacggga tcgatcctgc tttaatgaga tatgcgagac gcctatgatc 13500
gcatgatatt tgctttcaat tctgttgtgc acgttgtaaa aaacctgagc atgtgtagct 13560 
cagatcctta ccgccggttt cggttcattc taatgaatat atcacccgtt actatcgtat 13620 
ttttatgaat aatattctcc gttcaattta ctgattgtac cctactactt atatgtacaa 13680 
tattaaaatg aaaacaatat attgtgctga ataggtttat agcgacatct atgatagagc 13740 
gccacaataa caaacaattg cgttttatta ttacaaatcc aattttaaaa aaagcggcag 13800 
aaccggtcaa acctaaaaga ctgattacat aaatcttatt caaatttcaa aaggccccag 13860 



* 
gggctagtat ctacgacaca ccgagcggcg aactaataac gttcactgaa gggaactccg 13920
gttccccgcc ggcgcgcatg ggtgagattc cttgaagttg agtattggcc gtccgctcta 13980 
ccgaaagtta cgggcaccat tcaacccggt ccagcacggc ggccgggtaa ccgacttgct 14040 
gccccgagaa ttatgcagca tttttttggt gtatgtgggc cccaaatgaa gtgcaggtca 14100 
aaccttgaca gtgacgacaa atcgttgggc gggtccaggg cgaattttgc gacaacatgt 14160 



cgaggctcag caggatgggc ccag 14184

<210> 59
<211> 1011
<212> DNA
<213> Zea mays

<220>
<221> CDS
<222> (1)..(981)
<223> coding for 5-methylthioribose kinase

<400> 59
gca cga gca ctc ctc tcc tct cct ctc gcc ggc gca tcg ccc gac tgt 48
Ala Arg Ala Leu Leu Ser Ser Pro Leu Ala Gly Ala Ser Pro Asp Cys
1 5 10 15

cag tca gcc tca gcc atg gcc gcg gag gag gag cag ggc ttc cgc ccg 96
Gln Ser Ala Ser Ala Met Ala Ala Glu Glu Glu Gln Gly Phe Arg Pro
20 25 30

ctg gac gag tcg tcc ctg ctc gcc tac atc aag gcc acg ccg gcg ctc 144
Leu Asp Glu Ser Ser Leu Leu Ala Tyr Ile Lys Ala Thr Pro Ala Leu
35 40 45

gcc tcc cgc ctc ggc ggc ggt ggc agt cta gac tcc atc gag atc aag 192
Ala Ser Arg Leu Gly Gly Gly Gly Ser Leu Asp Ser Ile Glu Ile Lys
50 55 60

gag gtc ggc gac ggc aac ctc aac ttc gtc tac atc gtg cag tcc gag 240
Glu Val Gly Asp Gly Asn Leu Asn Phe Val Tyr Ile Val Gln Ser Glu
65 70 75 80

gcc ggc gcc atc gtc gtc aag cag gcg ctc ccg tac gtg cgc tgc gtg 288
Ala Gly Ala Ile Val Val Lys Gln Ala Leu Pro Tyr Val Arg Cys Val
85 90 95

ggg gat tcg tgg ccc atg acg cgg gag cgc gcc tac ttc gag gcc tcc 336
Gly Asp Ser Trp Pro Met Thr Arg Glu Arg Ala Tyr Phe Glu Ala Ser
100 105 110

acg ctg cgg gag cac ggc cgc ctg tgc ccg gag cac acc ccc gag gtg 384
Thr Leu Arg Glu His Gly Arg Leu Cys Pro Glu His Thr Pro Glu Val
115 120 125

tac cac ttc gac cgg acc ttg
Tyr His Phe Asp Arg Thr Leu
130 135

ctg atg ggg atg cgc tac atc gag 432
Leu Met Gly Met Arg Tyr Ile Glu
140

ccc ccg cac atc atc ctc cgc aag ggc ctc gtc gcc ggt gtc gag tac 480
Pro Pro His Ile Ile Leu Arg Lys Gly Leu Val Ala Gly Val Glu Tyr
145 150 155 160

ccg ctg ctc gcc gac cac atg tcc gat tac atg gcc aag acg ctc ttc 528
Pro Leu Leu Ala Asp His Met Ser Asp Tyr Met Ala Lys Thr Leu Phe
165 170 175

ttc acc tcc ctc ctc tat aac aat acc acg gat cat aag aac gga gtt 576
Phe Thr Ser Leu Leu Tyr Asn Asn Thr Thr Asp His Lys Asn Gly Val
180 185 190
gct aag tac tct gcg aac gtg gag atg tgt agg ctc acg gag caa gtt 624
Ala Lys Tyr Ser Ala Asn Val Glu Met Cys Arg Leu Thr Glu Gln Val
195 200 205

gtg ttc tcg gac cca tac cgt gtt tcc aaa ttt aat cgg tgg acc tcg 672
Val Phe Ser Asp Pro Tyr Arg Val Ser Lys Phe Asn Arg Trp Thr Ser
210 215 220

cct tat ctc gac aaa gat gct gag gca gtt cgc gag gat gat gag ctc 720
Pro Tyr Leu Asp Lys Asp Ala Glu Ala Val Arg Glu Asp Asp Glu Leu
225 230 235 240

aag ttg gaa gta gct ggg ctg aaa tcg atg ttt atc gag aga gct caa 768
Lys Leu Glu Val Ala Gly Leu Lys Ser Met Phe Ile Glu Arg Ala Gln
245 250 255

gct ctg att cat gga gat ctc cac act ggt tct atc atg gtg acc gaa 816
Ala Leu Ile His Gly Asp Leu His Thr Gly Ser Ile Met Val Thr Glu
260 265 270

gtt caa ctc aag tca ttg atc cag aat ttg ggt tct atg ggg cca atg 864
Val Gln Leu Lys Ser Leu Ile Gln Asn Leu Gly Ser Met Gly Pro Met
275 280 285

ggg ttt gat att ggg agc ctt cct tgg aaa cct gat ttt ggg cat act 912
Gly Phe Asp Ile Gly Ser Leu Pro Trp Lys Pro Asp Phe Gly His Thr
290 295 300

atg cac aga atg ggc atg ctg atc aag cga atg atc gta agg ctt aca 960
Met His Arg Met Gly Met Leu Ile Lys Arg Met Ile Val Arg Leu Thr
305 310 315 320

aga atg gat ctt gaa gac aat tgaagagtcg tggaatttgt tccacaaaaa 1011
Arg Met Asp Leu Glu Asp Asn
325

<210> 60
<211> 327
<212> PRT
<213> Zea mays

<400> 60
Ala Arg Ala Leu Leu Ser Ser Pro Leu Ala Gly Ala Ser Pro Asp Cys
1 5 10 15

Gln Ser Ala Ser Ala Met Ala Ala Glu Glu Glu Gln Gly Phe Arg Pro
20 25 30

Leu Asp Glu Ser Ser Leu Leu Ala Tyr Ile Lys Ala Thr Pro Ala Leu
35 40 45

Ala Ser Arg Leu Gly Gly Gly Gly Ser Leu Asp Ser Ile Glu Ile Lys
50 55 60

Glu Val Gly Asp Gly Asn Leu Asn Phe Val Tyr Ile Val Gln Ser Glu
65 70 75 80

Ala Gly Ala Ile Val Val Lys Gln Ala Leu Pro Tyr Val Arg Cys Val
85 90 95

Gly Asp Ser Trp Pro Met Thr Arg Glu Arg Ala Tyr Phe Glu Ala Ser
100 105 110

Thr Leu Arg Glu His Gly Arg Leu Cys Pro Glu His Thr Pro Glu Val
115 120 125

Tyr His Phe Asp Arg Thr Leu Ser Leu Met Gly Met Arg Tyr Ile Glu
130 135 140

Pro Pro His Ile Ile Leu Arg Lys Gly Leu Val Ala Gly Val Glu Tyr
145 150 155 160

Pro Leu Leu Ala Asp His Met Ser Asp Tyr Met Ala Lys Thr Leu Phe
165 170 175

Phe Thr Ser Leu Leu Tyr Asn Asn Thr Thr Asp His Lys Asn Gly Val
180 185 190

Ala Lys Tyr Ser Ala Asn Val Glu Met Cys Arg Leu Thr Glu Gln Val
195 200 205

Val Phe Ser Asp Pro Tyr Arg Val Ser Lys Phe Asn Arg Trp Thr Ser
210 215 220

Pro Tyr Leu Asp Lys Asp Ala Glu Ala Val Arg Glu Asp Asp Glu Leu
225 230 235 240

Lys Leu Glu Val Ala Gly Leu Lys Ser Met Phe Ile Glu Arg Ala Gln
245 250 255

Ala Leu Ile His Gly Asp Leu His Thr Gly Ser Ile Met Val Thr Glu
260 265 270

Val Gln Leu Lys Ser Leu Ile Gln Asn Leu Gly Ser Met Gly Pro Met
275 280 285

Gly Phe Asp Ile Gly Ser Leu Pro Trp Lys Pro Asp Phe Gly His Thr
290 295 300

Met His Arg Met Gly Met Leu Ile Lys Arg Met Ile Val Arg Leu Thr
305 310 315 320

Arg Met Asp Leu Glu Asp Asn
325
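
The numeric identifiers in each record of this listing appear to follow the WIPO ST.25 convention (<210> SEQ ID NO, <211> length, <212> molecule type, <213> organism, <220> to <223> feature data, <400> start of the sequence). As a minimal sketch under that assumption, the following Python snippet parses the header of SEQ ID NO: 60 and compares the declared length with the number of residues listed. The names `FIELD_NAMES` and `parse_header` are hypothetical helpers introduced only for this illustration.

```python
import re

# Meaning of the numeric identifiers, assuming the WIPO ST.25 convention.
FIELD_NAMES = {
    "210": "seq_id_no",
    "211": "length",
    "212": "type",
    "213": "organism",
    "220": "feature",
    "221": "name_key",
    "222": "location",
    "223": "other_information",
    "400": "sequence_start",
}

HEADER_LINE = re.compile(r"^<(\d{3})>\s*(.*?)\s*$")


def parse_header(lines):
    """Map each recognised '<NNN> value' line to a named field."""
    record = {}
    for line in lines:
        match = HEADER_LINE.match(line)
        if match and match.group(1) in FIELD_NAMES:
            record[FIELD_NAMES[match.group(1)]] = match.group(2)
    return record


header = parse_header(
    ["<210> 60", "<211> 327", "<212> PRT", "<213> Zea mays", "<400> 60"]
)

# SEQ ID NO: 60 is printed as 20 full rows of 16 residues plus a last row of 7.
declared = int(header["length"])
counted = 20 * 16 + 7
print(header["organism"], declared == counted)
```

The same check can be run on any other record of the listing by passing its header lines and counting the residues or nucleotides that follow the <400> line.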



<210> 61
<211> 471
<212> DNA
<213> Brassica napus

<220>
<221> CDS
<222> (2)..(469)
<223> coding for 5-methylthioribose kinase

<400> 61
a ttt ccg ggt cga cga ttt cgt ggc aat ctc aac ttc gtt ttc atc gtc 49
Phe Pro Gly Arg Arg Phe Arg Gly Asn Leu Asn Phe Val Phe Ile Val
1 5 10 15

atc gga tcc act ggc tca ctc gtc atc aaa cag gcg ctt ccg tat ata 97
Ile Gly Ser Thr Gly Ser Leu Val Ile Lys Gln Ala Leu Pro Tyr Ile
20 25 30

cgt tgt att ggg gag tct tgg cca atg acg aaa gaa aga gct tac ttt 145
Arg Cys Ile Gly Glu Ser Trp Pro Met Thr Lys Glu Arg Ala Tyr Phe
35 40 45

gaa gct aca act ctg aga aag
Glu Ala Thr Thr Leu Arg Lys
50

gga gct ttg tct cct gat cat gtt 193
Gly Ala Leu Ser Pro Asp His Val
60

[...] gac aac acc atg gct ttg att gga atg agg 241
Pro Glu Val Tyr His Phe Asp Arg Thr Met Ala Leu Ile Gly Met Arg
65 70 75 80

tat ctc gag cct cct cac atc atc ctc cgc aaa gga ctc gtt gct gga 289
Tyr Leu Glu Pro Pro His Ile Ile Leu Arg Lys Gly Leu Val Ala Gly
85 90 95

atc cag tac cct ttc ctt gca gaa cac atg gct gat tac atg gcc aaa 337
Ile Gln Tyr Pro Phe Leu Ala Glu His Met Ala Asp Tyr Met Ala Lys
100 105 110

[...] 385
Thr Leu Phe Phe Thr Ser Leu Leu Tyr His Asp Thr Thr Glu His Lys
115 120 125

aga gca gta acc gag ttt tgt ggt aat gtg gag [...] tgc cgg tta acg 433
Arg Ala Val Thr Glu Phe Cys Gly Asn Val Glu Leu Cys Arg Leu Thr
130 135 140

gag caa gta gtg ttc tct gac ccg tat aga gtt tct ag 471
Glu Gln Val Val Phe Ser Asp Pro Tyr Arg Val Ser
145 150 155

<210> 62
<211> 156
<212> PRT
<213> Brassica napus

<400> 62
Phe Pro Gly Arg Arg Phe Arg Gly Asn Leu Asn Phe Val Phe Ile Val
1 5 10 15

Ile Gly Ser Thr Gly Ser Leu Val Ile Lys Gln Ala Leu Pro Tyr Ile
20 25 30

Arg Cys Ile Gly Glu Ser Trp Pro Met Thr Lys Glu Arg Ala Tyr Phe
35 40 45

Glu Ala Thr Thr Leu Arg Lys His Gly Ala Leu Ser Pro Asp His Val
50 55 60

Pro Glu Val Tyr His Phe Asp Arg Thr Met Ala Leu Ile Gly Met Arg
65 70 75 80

Tyr Leu Glu Pro Pro His Ile Ile Leu Arg Lys Gly Leu Val Ala Gly
85 90 95

Ile Gln Tyr Pro Phe Leu Ala Glu His Met Ala Asp Tyr Met Ala Lys
100 105 110

Thr Leu Phe Phe Thr Ser Leu Leu Tyr His Asp Thr Thr Glu His Lys
115 120 125

Arg Ala Val Thr Glu Phe Cys Gly Asn Val Glu Leu Cys Arg Leu Thr
130 135 140

Glu Gln Val Val Phe Ser Asp Pro Tyr Arg Val Ser
145 150 155



<210> 63
<211> 415
<212> DNA
<213> Brassica napus

<220>
<221> CDS
<222> (3)..(413)
<223> coding for 5-methylthioribose kinase

<400> 63
gg gtc gac gat ttc gtg ctg aga gca aaa gag atg tcg ttc gat gag 47
Val Asp Asp Phe Val Leu Arg Ala Lys Glu Met Ser Phe Asp Glu
1 5 10 15

ttc aag ccg ttg aac gag aaa tct cta gta gag tac ata aag gca acg 95
Phe Lys Pro Leu Asn Glu Lys Ser Leu Val Glu Tyr Ile Lys Ala Thr
20 25 30

cct gcc ctc tcc tcc agg ctc gga gac aag tac gat gat ctg gtc atc 143
Pro Ala Leu Ser Ser Arg Leu Gly Asp Lys Tyr Asp Asp Leu Val Ile
35 40 45

aag gaa gtt gga gat ggc aat ctc aac ttc gtt ttc atc gtt gtc gga 191
Lys Glu Val Gly Asp Gly Asn Leu Asn Phe Val Phe Ile Val Val Gly
50 55 60

tcc act ggc tca ctc gtc atc aaa cag gcg ctt ccg tat ata cgt tgt 239
Ser Thr Gly Ser Leu Val Ile Lys Gln Ala Leu Pro Tyr Ile Arg Cys
65 70 75

att gga gaa tca tgg cca atg acg aaa gaa aga gct tac ttt gaa gca 287
Ile Gly Glu Ser Trp Pro Met Thr Lys Glu Arg Ala Tyr Phe Glu Ala
80 85 90 95

aca act ctg aga aag cac ggt ggt ttg tct ccg gat cat gtt cct gaa 335
Thr Thr Leu Arg Lys His Gly Gly Leu Ser Pro Asp His Val Pro Glu
100 105 110

gtc tac cat ttt gac aga acc atg gct ttg att gga atg aga tac ctc 383
Val Tyr His Phe Asp Arg Thr Met Ala Leu Ile Gly Met Arg Tyr Leu
115 120 125

gag cct cct cac atc atc ctc cgc aaa gga ct 415
Glu Pro Pro His Ile Ile Leu Arg Lys Gly
130 135
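
The pairing of codon rows and amino acid rows in SEQ ID NO: 63 can be cross-checked mechanically. The following is a minimal Python sketch, not part of the application, that translates the first row of the sequence with the standard genetic code. The names `CODON_TABLE` and `translate` are illustrative; the input string is the first row as listed, with the scan-damaged codon `teg` read as `tcg`, and the offset of 2 is taken from the CDS location (3)..(413).

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered t, c, a, g.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, offset: int = 0) -> str:
    """Translate dna (spaces ignored) starting at a 0-based offset."""
    seq = dna.replace(" ", "").lower()[offset:]
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First row of SEQ ID NO: 63 (positions 1..47); the CDS starts at position 3.
first_row = "gg gtc gac gat ttc gtg ctg aga gca aaa gag atg tcg ttc gat gag"
peptide = translate(first_row, offset=2)
print(peptide)  # one-letter codes for Val Asp Asp Phe Val Leu Arg Ala Lys Glu Met Ser Phe Asp Glu
```

A row whose translation disagrees with the printed residues most likely contains a scanning error in either the codons or the amino acid line.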

<210> 64
<211> 137
<212> PRT
<213> Brassica napus

<400> 64
Val Asp Asp Phe Val Leu Arg Ala Lys Glu Met Ser Phe Asp Glu Phe
1 5 10 15

Lys Pro Leu Asn Glu Lys Ser Leu Val Glu Tyr Ile Lys Ala Thr Pro
20 25 30

Ala Leu Ser Ser Arg Leu Gly Asp Lys Tyr Asp Asp Leu Val Ile Lys
35 40 45

Glu Val Gly Asp Gly Asn Leu Asn Phe Val Phe Ile Val Val Gly Ser
50 55 60

Thr Gly Ser Leu Val Ile Lys Gln Ala Leu Pro Tyr Ile Arg Cys Ile
65 70 75 80

Gly Glu Ser Trp Pro Met Thr Lys Glu Arg Ala Tyr Phe Glu Ala Thr
85 90 95

Thr Leu Arg Lys His Gly Gly Leu Ser Pro Asp His Val Pro Glu Val
100 105 110

Tyr His Phe Asp Arg Thr Met Ala Leu Ile Gly Met Arg Tyr Leu Glu
115 120 125

Pro Pro His Ile Ile Leu Arg Lys Gly
130 135

<210> 65
<211> 424
<212> DNA
<213> Oryza sativa

<220>
<221> CDS
<222> (3)..(422)
<223> coding for 5-methylthioribose kinase

<400> 65
cc ctt ctc tac aac tcc acc act gat cac aag aaa gga gtt gct cag 47
Leu Leu Tyr Asn Ser Thr Thr Asp His Lys Lys Gly Val Ala Gln
1 5 10 15

tac tgc gat aat gtg gag atg tgt agg ctc aca gag caa gtc gtg ttc 95
Tyr Cys Asp Asn Val Glu Met Cys Arg Leu Thr Glu Gln Val Val Phe
20 25 30

tca gac cca tac atg ctc gcc aaa tac aat cgt tgc aca tca ccc ttc 143
Ser Asp Pro Tyr Met Leu Ala Lys Tyr Asn Arg Cys Thr Ser Pro Phe
35 40 45

cta cat aat gat gct gca gcg gtt cga gag gat gct gag ctt aaa ttg 191
Leu Asp Asn Asp Ala Ala Ala Val Arg Glu Asp Ala Glu Leu Lys Leu
50 55 60

[...] att act aaa ttg aaa tca atg ttt att gag aga gca cag gct ctt 239
Glu Ile Ala Glu Leu Lys Ser Met Phe Ile Glu Arg Ala Gln Ala Leu
65 70 75

ctt cat gga gat ctc cac act ggt tcc atc atg gtg aca cca gat tct 287
Leu His Gly Asp Leu His Thr Gly Ser Ile Met Val Thr Pro Asp Ser
80 85 90 95

[...] att gat cca gaa ttt gct ttc tat ggc cca atg ggt tac 335
Thr Gln Val Ile Asp Pro Glu Phe Ala Phe Tyr Gly Pro Met Gly Tyr
100 105 110

aac att ggg gcc ttc ctg ggg aac ttg att ttg gca tat ttt tca caa 383
Asp Ile Gly Ala Phe Leu Gly Asn Leu Ile Leu Ala Tyr Phe Ser Gln
115 120 125

gat gga cac gct gat caa gca aat gat cgt aag gct tac aa 424
Asp Gly His Ala Asp Gln Ala Asn Asp Arg Lys Ala Tyr
130 135 140

<210> 66
<211> 140
<212> PRT
<213> Oryza sativa

<400> 66
Leu Leu Tyr Asn Ser Thr Thr Asp His Lys Lys Gly Val Ala Gln Tyr
1 5 10 15

Cys Asp Asn Val Glu Met Cys Arg Leu Thr Glu Gln Val Val Phe Ser
20 25 30

Asp Pro Tyr Met Leu Ala Lys Tyr Asn Arg Cys Thr Ser Pro Phe Leu
35 40 45

Asp Asn Asp Ala Ala Ala Val Arg Glu Asp Ala Glu Leu Lys Leu Glu
50 55 60

Ile Ala Glu Leu Lys Ser Met Phe Ile Glu Arg Ala Gln Ala Leu Leu
65 70 75 80

His Gly Asp Leu His Thr Gly Ser Ile Met Val Thr Pro Asp Ser Thr
85 90 95

Gln Val Ile Asp Pro Glu Phe Ala Phe Tyr Gly Pro Met Gly Tyr Asp
100 105 110

Ile Gly Ala Phe Leu Gly Asn Leu Ile Leu Ala Tyr Phe Ser Gln Asp
115 120 125

Gly His Ala Asp Gln Ala Asn Asp Arg Lys Ala Tyr
130 135 140



<210> 67
<211> 404
<212> DNA
<213> Glycine max

<220>
<221> CDS
<222> (3)..(404)
<223> coding for 5-methylthioribose kinase

<400> 67
ta atc ccc gaa cat gtt cct gaa gtg tat cac ttt gac cgt acc atg 47
Ile Pro Glu His Val Pro Glu Val Tyr His Phe Asp Arg Thr Met
1 5 10 15

tct ttg atc ggt atg cgt tac ttg gag ccc cca cat ata atc ctc ata 95
Ser Leu Ile Gly Met Arg Tyr Leu Glu Pro Pro His Ile Ile Leu Ile
20 25 30

aaa ggg ttg att gct ggg att gag tac cct ttt ttg gct gaa cac atg 143
Lys Gly Leu Ile Ala Gly Ile Glu Tyr Pro Phe Leu Ala Glu His Met
35 40 45

gct gat ttc atg gcg aag aca ctc ttc ttc acg tct ctg ctt ttc cgt 191
Ala Asp Phe Met Ala Lys Thr Leu Phe Phe Thr Ser Leu Leu Phe Arg
50 55 60

tcc act gct gac cac aaa cgg gac gtt gcc gaa ttt tgt ggg aat gtg 239
Ser Thr Ala Asp His Lys Arg Asp Val Ala Glu Phe Cys Gly Asn Val
65 70 75

gag tta tgc agg ctc act gaa cag gtc gtt ttc tct gac cct tat aaa 287
Glu Leu Cys Arg Leu Thr Glu Gln Val Val Phe Ser Asp Pro Tyr Lys
80 85 90 95

gtt tct caa tat aat cgt tgg act tcc ccc tat ctt gat cgt gat gct 335
Val Ser Gln Tyr Asn Arg Trp Thr Ser Pro Tyr Leu Asp Arg Asp Ala
100 105 110

gag gct gtt cgg gaa gac aat ctg ctg aag ctt gaa gtt gct gag ctg 383
Glu Ala Val Arg Glu Asp Asn Leu Leu Lys Leu Glu Val Ala Glu Leu
115 120 125

aaa tcc aag ttc att gag agc 404
Lys Ser Lys Phe Ile Glu Ser
130

<210> 68
<211> 134
<212> PRT
<213> Glycine max

<400> 68
Ile Pro Glu His Val Pro Glu Val Tyr His Phe Asp Arg Thr Met Ser
1 5 10 15

Leu Ile Gly Met Arg Tyr Leu Glu Pro Pro His Ile Ile Leu Ile Lys
20 25 30

Gly Leu Ile Ala Gly Ile Glu Tyr Pro Phe Leu Ala Glu His Met Ala
35 40 45

Asp Phe Met Ala Lys Thr Leu Phe Phe Thr Ser Leu Leu Phe Arg Ser
50 55 60

Thr Ala Asp His Lys Arg Asp Val Ala Glu Phe Cys Gly Asn Val Glu
65 70 75 80

Leu Cys Arg Leu Thr Glu Gln Val Val Phe Ser Asp Pro Tyr Lys Val
85 90 95

Ser Gln Tyr Asn Arg Trp Thr Ser Pro Tyr Leu Asp Arg Asp Ala Glu
100 105 110

Ala Val Arg Glu Asp Asn Leu Leu Lys Leu Glu Val Ala Glu Leu Lys
115 120 125

Ser Lys Phe Ile Glu Ser
130



<210> 69
<211> 21
<212> DNA
<213> Artificial sequence

<220>
<223> Description of the artificial sequence:
oligonucleotide primer

<400> 69
cgtgaatacg gcgtggagtc g

<210> 70
<211> 20
<212> DNA
<213> Artificial sequence

<220>
<223> Description of the artificial sequence:
oligonucleotide primer

<400> 70
cggcaggata atcaggttgg

<210> 71
<211> 20
<212> DNA
<213> Artificial sequence

<220>
<223> Description of the artificial sequence:
oligonucleotide primer

<400> 71
gtcaacgtaa ccaaccctgc
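
For the three oligonucleotide primers, the length declared under <211> can be compared directly with the sequence printed under <400>. The following short Python sketch does this and also reports the GC fraction of each primer; it is an illustration only, the helper names `clean` and `gc_fraction` are hypothetical, and the first base of SEQ ID NO: 70 is read as `c` on the assumption that the `e` in the scan is a misread character.

```python
def clean(seq: str) -> str:
    """Remove the spacing used in the listing and normalise case."""
    return seq.replace(" ", "").lower()


def gc_fraction(seq: str) -> float:
    """Fraction of g and c bases in a sequence."""
    s = clean(seq)
    return (s.count("g") + s.count("c")) / len(s)


# SEQ ID NO -> (length declared under <211>, sequence listed under <400>)
primers = {
    69: (21, "cgtgaatacg gcgtggagtc g"),
    70: (20, "cggcaggata atcaggttgg"),
    71: (20, "gtcaacgtaa ccaaccctgc"),
}

for seq_id, (declared, seq) in primers.items():
    actual = len(clean(seq))
    print(seq_id, actual == declared, round(gc_fraction(seq), 2))
```

A mismatch between the declared and counted length would point to a dropped or duplicated base in the scanned text.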
We claim:

1. A process for preparing transformed plant cells or organisms,
which comprises the following steps:

a) transforming a population of plant cells, with the cells
of said population containing at least one marker protein
capable of causing directly or indirectly a toxic effect
for said population, with at least one nucleic acid
sequence to be inserted in combination with at least one
double-stranded marker protein ribonucleic acid sequence
or an expression cassette or expression cassettes ensuring
expression thereof capable of reducing the expression
of at least one marker protein, and

b) selecting transformed plant cells whose genome contains
said nucleic acid sequence and which have a growth advantage
over nontransformed cells, due to the action of said
double-stranded marker protein ribonucleic acid sequence,
from said population of plant cells, the selection being
carried out under conditions under which the marker protein
can exert its toxic effect on the nontransformed cells.



2. The process as claimed in claim 1, wherein the marker protein
is capable of converting directly or indirectly a substance X
which is nontoxic for said population of plant cells into a
substance Y which is toxic for said population, which process
comprises the following steps:

a) transforming the population of plant cells with at least
one nucleic acid sequence to be inserted in combination
with at least one double-stranded marker protein ribonucleic
acid sequence or an expression cassette or expression
cassettes ensuring expression thereof capable of reducing
the expression of at least one marker protein, and

b) treating said population of plant cells with the substance
X at a concentration which causes a toxic effect
for nontransformed cells, due to the conversion by the
marker protein, and

c) selecting transformed plant cells whose genome contains
said nucleic acid sequence and which have a growth advantage
over nontransformed cells, due to the action of said
double-stranded marker protein ribonucleic acid sequence,
from said population of plant cells, the selection being
carried out under conditions under which the marker protein
can exert its toxic effect on the nontransformed cells.

3. The process as claimed in claim 2, wherein the nontoxic
substance X is a substance which does not naturally occur in
plant cells or organisms or occurs naturally therein only at
a concentration which can essentially not cause any toxic
effect.

4. The process as claimed in claim 2 or 3, wherein the substance
X is a substance selected from the group consisting of
proherbicides, proantibiotics, nucleoside analogs,
5-fluorocytosine, auxinamide compounds, naphthalacetamide,
dihaloalkanes, Acyclovir, Ganciclovir,
1,2-deoxy-2-fluoro-b-D-arabinofuranosil-5-iodouracil,
6-thioxanthine, allopurinol, 6-methylpurine
deoxyribonucleoside, 4-aminopyrazolopyrimidine,
2-amino-4-methoxybutanoic acid, 5-(trifluoromethyl)thioribose
and allyl alcohol.

5. The process as claimed in any of claims 1 to 4, wherein the
marker protein is selected from the group consisting of
cytosine deaminases, cytochrome P-450 enzymes, indoleacetic
acid hydrolases, haloalkane dehalogenases, thymidine kinases,
guanine phosphoribosyl transferases, hypoxanthine
phosphoribosyl transferases, xanthine guanine phosphoribosyl
transferases, purine nucleoside phosphorylases, phosphonate
monoester hydrolases, indoleacetamide synthases,
indoleacetamide hydrolases, adenine phosphoribosyl
transferases, methoxinine dehydrogenases, rhizobitoxin
synthases, 5-methylthioribose kinases and alcohol
dehydrogenases.

6. The process as claimed in any of claims 1 to 5, wherein the
marker protein is encoded by

a) a sequence described by the GenBank accession number
S56903, M32238, NC003308, AE009419, AB016260, NC002147,
M26950, J02224, V00470, V00467, U10247, M13422, X00221,
M60917, U44852, M61151, AF039169, AB025110, AF212863,
AC079674, X77943, M12196, AF172282, X04049 or AF253472

b) a sequence according to SEQ ID NO: 2, 4, 6, 8, 10, 12,
14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40,
42, 44, 46 or 48.

7. The process as claimed in any of claims 1 to 6, wherein a
sequence coding for a resistance to at least one toxin,
antibiotic or herbicide is introduced together with the nucleic
acid sequence to be inserted and selection is carried out
additionally using the toxin, antibiotic or herbicide.

8. The process as claimed in any of claims 1 to 7, wherein the
nucleic acid sequence to be inserted into the genome of the
plant cell or of the plant organism comprises at least one
expression cassette capable of expressing, under the control
of a promoter functional in plant cells or in plant
organisms, an RNA and/or a protein which does not cause the
expression, amount, activity and/or function of a marker
protein to be reduced.

9. The process as claimed in any of claims 1 to 8, wherein the
plant cell is part of a plant organism or of a tissue, part,
organ, cell culture or propagation material derived
therefrom.

10. The process as claimed in any of claims 1 to 9 for preparing
transformed plant cells or organisms, which comprises the
following steps:

a) transforming a population of plant cells which comprises
at least one non-endogenous (preferably non-plant) marker
protein capable of converting directly or indirectly a
substance X which is nontoxic for said population of
plant cells into a substance Y which is toxic for said
population, with at least one nucleic acid sequence to be
inserted in combination with at least one nucleic acid
sequence coding for a double-stranded marker protein
ribonucleic acid sequence or an expression cassette or
expression cassettes ensuring expression thereof, said
ribonucleic acid sequence being capable of reducing the
expression, amount, activity and/or function of said
marker protein, and

b) treating said population of plant cells with the
substance X at a concentration which causes a toxic effect
for nontransformed cells, due to the conversion by the
marker protein, and

c) selecting transformed plant cells (and/or populations of
plant cells, such as plant tissues or plants) whose
genome contains said nucleic acid sequence and which have a
growth advantage over nontransformed cells, due to the
action of said double-stranded marker protein ribonucleic
acid sequence, from said population of plant cells, the
selection being carried out under conditions under which
the marker protein can exert its toxic effect on the
nontransformed cells, and

d) regenerating fertile plants, and

e) eliminating by crossing the nucleic acid sequence coding
for the marker protein and isolating fertile plants whose
genome contains said nucleic acid sequence but no longer
contains the sequence coding for the marker protein.



11. An amino acid sequence coding for a plant 5-methylthioribose
kinase, wherein said amino acid sequence contains at least
one sequence selected from the group consisting of SEQ ID NO:
60, 62, 64, 66 or 68.



12. A nucleic acid sequence coding for a plant 5-methylthioribose
kinase, wherein said nucleic acid sequence contains at least
one sequence selected from the group consisting of SEQ ID NO:
59, 61, 63, 65 or 67.

13. A double-stranded RNA molecule, comprising

a) a "sense" RNA strand comprising at least one
ribonucleotide sequence which is essentially identical to at
least a part of the "sense" RNA transcript of a nucleic acid
sequence coding for a marker protein, and

b) an "antisense" RNA strand which is essentially,
preferably fully, complementary to the RNA sense strand under
a).



14. The double-stranded RNA molecule as claimed in claim 13,
wherein the marker protein is defined as in any of claims 2
to 6.

15. The double-stranded RNA molecule as claimed in either of
claims 13 and 14, wherein the "sense" RNA strand and the
"antisense" RNA strand are covalently linked to one another in
the form of an inverted repeat.

16. A transgenic expression cassette, comprising a nucleic acid
sequence which codes for a double-stranded RNA molecule as
claimed in any of claims 13 to 15 and which is functionally
linked to a promoter functional in plant organisms.

17. A transgenic vector, comprising a transgenic expression
cassette as claimed in claim 16.

18. A transgenic plant organism, comprising a double-stranded RNA
molecule as claimed in any of claims 13 to 15, a transgenic
expression cassette as claimed in claim 16 or a transgenic
vector as claimed in claim 17.

19. The transgenic plant organism as claimed in claim 18,
selected from the group of plants consisting of wheat, oats,
[illegible], barley, rye, corn, rice, buckwheat, sorghum,
triticale, spelt, linseed, sugar cane, oilseed rape, cress,
arabidopsis, cabbage species, soybean, alfalfa, [illegible],
potato, tobacco, tomato, eggplant, paprika, sunflower,
tagetes, lettuce, calendula, melon, pumpkin and zucchini.

20. A tissue, an organ, a part, a cell, a cell culture or
propagation material derived from a transgenic plant organism
as claimed in either of claims 18 and 19.
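
The inverted-repeat arrangement of claim 15, in which the "sense" and the "antisense" RNA strands are covalently linked, can be illustrated with a minimal sketch. The function names, the four-nucleotide spacer and the example fragment below are hypothetical and are not taken from the specification; the sketch only shows that a sense region followed by its reverse complement yields a single transcript that can fold back on itself to form a double-stranded region.

```python
# Illustrative sketch only: names, spacer and example fragment are assumptions,
# not sequences or methods disclosed in the specification.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def inverted_repeat(sense: str, spacer: str = "GAAA") -> str:
    """Join sense strand, spacer (loop) and antisense strand in one transcript."""
    sense = sense.upper()
    return sense + spacer.upper() + reverse_complement(sense)


# Hypothetical 12-nt fragment standing in for part of a marker protein transcript.
fragment = "AUGGCUAACGUU"
hairpin = inverted_repeat(fragment)
print(hairpin)
```

In this toy example the last twelve nucleotides of `hairpin` are the reverse complement of the first twelve, so the two arms can base-pair while the spacer forms the loop.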
Clostridium tetani
Klebsiella pneumoniae

1 50

(1) MSQYHTFTAHDAVAYAQO
(1) MSRFDSHFRMETEDAILYAKE
(1) ARALLSSPLAGASPDCQSASAMAAEEEQGFRPLDESSLLAYIKATPALAS
(1) MSFEEFTPIjNEKSL.VDYIKSTPAL.SS
(1) VDDFVLRAKEMSFDEFKPLiNEKSLVEYIKATPAliSS
(1)
(1)
(1) L V A

51 100
(19) FAGIDNPSELVSAQEVGDGNuNLVFKVFDRQGVSRAIVKQALPYVRCVGE
(22) KLGIFDEHAKLQAEElGDGNINYVTKVirovlOTKKSV
(51) RLiGGGGSLDSIEIKEVGDGNLNFVYIVQSEAGA--IWKQALPYVRCVGD
(27) KIGADKSDDDLVIKEVGDGNLNFVFIVVGSSGS--LVIKQALPYIRCIGE
(37) RXiGDKY--DDLVIKEVGDGNLNFVFIWGSTGS--LVIKQALPYIRCIGE
(1)
(1)
(51) D L EVGDGNLNFVF V G LVIKQALPYIRCIGE

101 150
(69) SWPLTLDRARLEAQTLVAHYQHSPQHTVKIHHFDPELAVMVMEDLS-DHR
(72) --ELDVDRNRIEAEVLMLQGILAPGLVPKVYKYDSVMCNLSMEDIS-DHR
(99) SWPMTRERAYFEASTLREHGRI^PEHTPEVYHFDRTLSLMGMRYIEPPHI
(75) SWPMTKERAYFEATTLRKHGNLSPDHVPEVYHFDRTMALIGMRYLEPPHI
(83) SWPMTKERAYFEATTLRKHGGLSPDHVPEVYHFDRTMALIGMRYLEPPHI
(1) TPEHVPEVYHFDRTMSLIGMRYLEPPHI
(101) SWPMT ERA EA TL HG LS PDHVPEVYHFDRTMAL I GMRYLEP PH I

151 200
(118) IWRGELIANVYYPQAARQLGDYI^QVLFHTSDFYLHFHEKKAQVAQFIN-
(119) NLRKELLKRNTFPSFAEHITTFIVDTLLPTTDLVMDSGEKKDNVKKYIN-
(149) ILRKGLVAGVEYPLLADHMSDYMAKTLFFTSLLYNNTTDHKNGVAKYSAN
(125) ILRKGLIAGIEYPFLADHMSDYMAKTLFFTSLLYHDTTEHRRAVTEFCGN
(133) ILRKG------
(29) ILIKGLIAGIEYPFLAEHMADFMAKTLFFTSLLFRSTADHKRDVAEFCGN
(1) IrLYNSTTDHKKGVAQYCDN
(151) ILRKGLIA I YP ADHM DYMA TLF TSLLY T DHK VA F N

201 250
(167) PAMCEITEDLFFNDPYQIHERN--NYPAELEADVAALRDDAQLKLAVAAL
(168) KDLCKISEDLVFTEPFIDYKSRNTVLEENIEFVKRQLYEDKELILEAGKL
(199) VEMCRLTEQWFSDPYRVSKFNR-WTSFYLDKDAEAVREDDELKLEVAGL
(175) VELCRLTEQWFSDPYRVSTFNR-WTSPYLDDDAKAVREDSALKLEIAEL
(138)
(79) VELCRLTEQWFSDPYKVSQYNR-WTSPYLDRDAEAVREDNLLKLEVAEL
(20) VEMCRLTEQWFSDPYMLAKYNR-CTSPFLDNDAAAVREDAELKLEIAEL
(201) VELCRLTEQWFSDPY VS FNR TSPYLD DA AVRED LKLEVA L
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KS FIE AQALIHGDLHTGSI V S ID EFAFYGPMGFDIG IG 

301 350 
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